{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "bc3c50f0",
   "metadata": {},
   "source": [
    "### Introduction"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5769c09f",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "This tutorial shows how to use active learning to rapidly process a large dataset with a computationally expensive method such as docking, FEP, or 3D similarity calculations.  The code here is a scaled-down version of the code in our paper [\"Optimizing active learning for free energy calculations\"](https://www.sciencedirect.com/science/article/pii/S2667318522000204).  It is fully functional; I just removed a few of the options and benchmarks from our original work."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aec40556",
   "metadata": {},
   "source": [
    "### Installation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df5be2a6",
   "metadata": {},
   "source": [
    "Install the necessary Python libraries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "4e0fdb79",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2025-05-05T22:00:43.275003Z",
     "start_time": "2025-05-05T22:00:43.273163Z"
    }
   },
   "outputs": [],
   "source": [
    "%%capture\n",
    "import sys\n",
    "IN_COLAB = 'google.colab' in sys.modules\n",
    "if IN_COLAB:\n",
    "    !pip install pandas numpy seaborn useful_rdkit_utils tqdm scikit-learn 'modAL-python>=0.4.1'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "bdc5358b-51a1-4b8c-bc5d-f484eb8eab6b",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from operator import itemgetter\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "import sklearn.gaussian_process as gp\n",
    "import useful_rdkit_utils as uru\n",
    "from modAL.acquisition import BaseLearner\n",
    "from modAL.models import BayesianOptimizer\n",
    "from modAL.utils.data import modALinput\n",
    "from modAL.acquisition import optimizer_PI\n",
    "from rdkit import Chem\n",
    "from sklearn.gaussian_process import GaussianProcessRegressor\n",
    "from tqdm.auto import tqdm"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd186288",
   "metadata": {},
   "source": [
    "Grab the necessary data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "101b1f0a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "if IN_COLAB:\n",
    "    import urllib.request\n",
    "\n",
    "    os.makedirs(\"./data\", exist_ok=True)\n",
    "    url = \"https://raw.githubusercontent.com/PatWalters/practical_cheminformatics_tutorials/main/active_learning/data/tyk2_fep.csv\"\n",
    "    filename = \"data/tyk2_fep.csv\"\n",
    "    urllib.request.urlretrieve(url, filename)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ead57862",
   "metadata": {},
   "source": [
    "### Defining an Oracle"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94f7e918",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "As with the classification example, we'll define an oracle that looks up values from a dataframe.  In practice, the oracle would perform some more expensive calculation like docking, FEP, or shape overlap. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ed43e30d",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
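   },
   "outputs": [],
   "source": [
    "class Oracle:\n",
    "    # Stand-in oracle that looks up precomputed values in a dataframe\n",
    "    def __init__(self, df, col_name, invert=True):\n",
    "        self.df = df\n",
    "        self.col_name = col_name\n",
    "        # NOTE: this assignment leaves the column unchanged, so `invert` currently has no effect\n",
    "        if invert:\n",
    "            self.df[col_name] = self.df[col_name]\n",
    "\n",
    "    def sample(self, num):\n",
    "        # Draw `num` random rows; return fingerprints, values, and row indices\n",
    "        sample_df = self.df.sample(num)\n",
    "        return sample_df.fp.values, sample_df[self.col_name].values, sample_df.index\n",
    "\n",
    "    def get_values(self, idx_list):\n",
    "        # Use the oracle's own dataframe rather than the global `df`\n",
    "        return self.df[self.col_name].values[idx_list]"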
   ]
  },
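  {
   "cell_type": "markdown",
   "id": "a7e3c1d0",
   "metadata": {},
   "source": [
    "As a rough sketch of how a non-lookup oracle might be organized, the example below wraps a scoring function and caches its results so that no molecule is evaluated twice.  Both `expensive_score` and `CachedOracle` are hypothetical names used only for illustration; they are not part of this notebook or of any library.\n",
    "\n",
    "```python\n",
    "def expensive_score(smiles):\n",
    "    # Hypothetical placeholder for a docking, FEP, or shape-overlap calculation\n",
    "    return float(len(smiles))\n",
    "\n",
    "\n",
    "class CachedOracle:\n",
    "    # Illustrative only: remembers every value it has already computed\n",
    "    def __init__(self, score_fn):\n",
    "        self.score_fn = score_fn\n",
    "        self.cache = {}\n",
    "        self.n_calls = 0\n",
    "\n",
    "    def get_values(self, smiles_list):\n",
    "        for smi in smiles_list:\n",
    "            if smi not in self.cache:\n",
    "                self.cache[smi] = self.score_fn(smi)\n",
    "                self.n_calls += 1\n",
    "        return [self.cache[smi] for smi in smiles_list]\n",
    "\n",
    "\n",
    "demo_oracle = CachedOracle(expensive_score)\n",
    "print(demo_oracle.get_values(['CCO', 'c1ccccc1', 'CCO']))\n",
    "print(demo_oracle.n_calls)  # 2: the repeated molecule was not recomputed\n",
    "```\n",
    "\n",
    "The second request for `CCO` is served from the cache, which matters when each call is a docking or FEP run."
   ]
  },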
  {
   "cell_type": "markdown",
   "id": "55e13d03",
   "metadata": {},
   "source": [
    "### Defining the Kernel Function for the Machine Learning Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9909b693",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "When we do active learning, we have to define a machine learning model that acts as a surrogate for the more expensive calculations.  In this case, we're going to use Gaussian Process Regression (GPR) to build our regression models.  To use GPR, we need to define a kernel function.  Here we calculate a kernel based on the Tanimoto similarities of the molecules."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "669dc440-93c9-4d29-8256-4249efb8c213",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "def calculate_similarity(a, b):\n",
    "    # Tanimoto similarity a vs. b\n",
    "    aa = np.sum(a, axis=1, keepdims=True)\n",
    "    bb = np.sum(b, axis=1, keepdims=True)\n",
    "    ab = np.matmul(a, b.T)\n",
    "    return np.true_divide(ab, aa + bb.T - ab)\n",
    "\n",
    "\n",
    "class TanimotoKernel(gp.kernels.NormalizedKernelMixin,\n",
    "                     gp.kernels.StationaryKernelMixin, gp.kernels.Kernel):\n",
    "\n",
    "    def __init__(self):\n",
    "        pass\n",
    "\n",
    "    def __call__(self, X, Y=None, eval_gradient=False):\n",
    "        assert not eval_gradient\n",
    "        if Y is None:\n",
    "            Y = X\n",
    "        return calculate_similarity(X, Y)"
   ]
  },
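  {
   "cell_type": "markdown",
   "id": "b2d94f61",
   "metadata": {},
   "source": [
    "To make the arithmetic in `calculate_similarity` concrete, here is a small worked example on two made-up 6-bit fingerprints.  The vectors are toy values chosen for illustration, not real molecular fingerprints.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Two toy fingerprints, one per row (made up for illustration)\n",
    "a = np.array([[1, 1, 0, 1, 0, 0]])\n",
    "b = np.array([[1, 0, 0, 1, 1, 0]])\n",
    "\n",
    "common = np.matmul(a, b.T)  # bits set in both fingerprints: 2\n",
    "bits_a = np.sum(a, axis=1, keepdims=True)  # 3 bits set in a\n",
    "bits_b = np.sum(b, axis=1, keepdims=True)  # 3 bits set in b\n",
    "\n",
    "# Tanimoto = shared bits / (bits in a + bits in b - shared bits) = 2 / 4\n",
    "tanimoto = np.true_divide(common, bits_a + bits_b.T - common)\n",
    "print(tanimoto)  # [[0.5]]\n",
    "```\n",
    "\n",
    "With several fingerprints per array, the same expression returns the full pairwise similarity matrix, which is the shape a Gaussian process kernel needs."
   ]
  },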
  {
   "cell_type": "markdown",
   "id": "21cbb28d",
   "metadata": {},
   "source": [
    "### Reading the Data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e99227cf",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Enable progress bars for the Pandas apply function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a1678470-9858-43e8-8132-d9518e1dcd9d",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "tqdm.pandas()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bae9d650",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Read the input data.  This data comes from [\"Optimizing active learning for free energy calculations\"](https://www.sciencedirect.com/science/article/pii/S2667318522000204)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "ce34db47-4911-41a0-9c04-5c95f56be258",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"data/tyk2_fep.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a4b2af7",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Take a quick look at the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "e3c2c66d",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ligand_id</th>\n",
       "      <th>SMILES</th>\n",
       "      <th>dG_bind</th>\n",
       "      <th>dG_bind_err</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>O=C(Nc1ccnc(NC(=O)C2CCC2)c1)c1c(Cl)cccc1Cl</td>\n",
       "      <td>-2.995</td>\n",
       "      <td>0.455</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>O=C(Nc1ccnc(NC(=O)C2CCCC2)c1)c1ccccc1Cl</td>\n",
       "      <td>8.731</td>\n",
       "      <td>0.462</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>O=C(Nc1ccnc(NC(=O)C2CCC2)c1)c1ccccc1Cl</td>\n",
       "      <td>3.316</td>\n",
       "      <td>0.448</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>O=C(Nc1ccnc(NC(=O)C2CC2)c1)c1c(Cl)cc(Cl)cc1Cl</td>\n",
       "      <td>-0.070</td>\n",
       "      <td>0.462</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>O=C(Nc1ccnc(NC(=O)C2CC2)c1)c1cc(Cl)ccc1Cl</td>\n",
       "      <td>3.431</td>\n",
       "      <td>0.449</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9992</th>\n",
       "      <td>9994</td>\n",
       "      <td>COc1cc(Cl)c(C(=O)Nc2cc(Nc3cccc(C(N)=O)n3)ncc2F...</td>\n",
       "      <td>3.288</td>\n",
       "      <td>0.512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9993</th>\n",
       "      <td>9995</td>\n",
       "      <td>COc1ccnc(Nc2cc(NC(=O)c3cccc(Cl)c3N)c(F)cn2)c1</td>\n",
       "      <td>15.143</td>\n",
       "      <td>0.478</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9994</th>\n",
       "      <td>9996</td>\n",
       "      <td>O=C(COCc1ccccc1)Nc1cc(NC(=O)c2cccc(Cl)c2)c(F)cn1</td>\n",
       "      <td>5.480</td>\n",
       "      <td>0.500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9995</th>\n",
       "      <td>9997</td>\n",
       "      <td>COc1c(F)cccc1C(=O)Nc1ccnc(NC(=O)NC(C)C)c1</td>\n",
       "      <td>10.696</td>\n",
       "      <td>0.467</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9996</th>\n",
       "      <td>9998</td>\n",
       "      <td>O=C(Nc1cc(Nc2cncc(CO)n2)ncc1F)c1cccc(Cl)c1F</td>\n",
       "      <td>2.736</td>\n",
       "      <td>0.474</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>9997 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      ligand_id                                             SMILES  dG_bind  \\\n",
       "0             0         O=C(Nc1ccnc(NC(=O)C2CCC2)c1)c1c(Cl)cccc1Cl   -2.995   \n",
       "1             1            O=C(Nc1ccnc(NC(=O)C2CCCC2)c1)c1ccccc1Cl    8.731   \n",
       "2             2             O=C(Nc1ccnc(NC(=O)C2CCC2)c1)c1ccccc1Cl    3.316   \n",
       "3             3      O=C(Nc1ccnc(NC(=O)C2CC2)c1)c1c(Cl)cc(Cl)cc1Cl   -0.070   \n",
       "4             4          O=C(Nc1ccnc(NC(=O)C2CC2)c1)c1cc(Cl)ccc1Cl    3.431   \n",
       "...         ...                                                ...      ...   \n",
       "9992       9994  COc1cc(Cl)c(C(=O)Nc2cc(Nc3cccc(C(N)=O)n3)ncc2F...    3.288   \n",
       "9993       9995      COc1ccnc(Nc2cc(NC(=O)c3cccc(Cl)c3N)c(F)cn2)c1   15.143   \n",
       "9994       9996   O=C(COCc1ccccc1)Nc1cc(NC(=O)c2cccc(Cl)c2)c(F)cn1    5.480   \n",
       "9995       9997          COc1c(F)cccc1C(=O)Nc1ccnc(NC(=O)NC(C)C)c1   10.696   \n",
       "9996       9998        O=C(Nc1cc(Nc2cncc(CO)n2)ncc1F)c1cccc(Cl)c1F    2.736   \n",
       "\n",
       "      dG_bind_err  \n",
       "0           0.455  \n",
       "1           0.462  \n",
       "2           0.448  \n",
       "3           0.462  \n",
       "4           0.449  \n",
       "...           ...  \n",
       "9992        0.512  \n",
       "9993        0.478  \n",
       "9994        0.500  \n",
       "9995        0.467  \n",
       "9996        0.474  \n",
       "\n",
       "[9997 rows x 4 columns]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cec76c9",
   "metadata": {},
   "source": [
    "The binding free energies in the table are in kcal/mol.  We'll convert them to a pKd and call the new column \"Activity\".  The constant 0.5961 in the code below is RT in kcal/mol at roughly 300 K."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "7ab9248b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAekAAAHpCAYAAACmzsSXAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAKx9JREFUeJzt3Qt0VOW5//En3G9yCYYQlJtGBZSLJQpYpQIBRJpqSc8/tgGxB2mlQCu0lOYUEgwFXHgEjxRBuxDsQgpy0CqXUpCKrgpUjSIKCo5JhEK4tEi4LANJ2P/1vKczZWCCBJKZZybfz1p7TWbvPTPv7Ezym/fd737fOM/zPAEAAObUinQBAABAaIQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSIuIXip+/PhxdwsAgBWEtIicOHFCmjVr5m4BALCCkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMqhPpAgCIbmnpGVJ05GjIbUkJ8bJ61YqwlwmIFYQ0gCuiAZ08PDfkNt/SbI4ucAVo7gYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADAqoiE9bdo0iYuLC1o6deoU2F5SUiJjx46Vli1bSpMmTSQ9PV0OHToU9Bx79+6VoUOHSqNGjaRVq1YyadIkKSsri8C7AQCgatWRCLv55pvl9ddfD9yvU+ffRZowYYKsXbtWVq5cKc2aNZNx48bJsGHD5O2333bby8vLXUC3bt1atmzZIkVFRfLggw9K3bp1ZebMmRF5PwAAxExIayhryJ6vuLhYFi1aJMuWLZP+/fu7dYsXL5bOnTvLtm3bpHfv3rJhwwbZtWuXC/nExETp0aOHTJ8+XSZPnuxq6fXq1Qv5mqdPn3aL3/Hjx6vxHQIAEKXnpD/77DNp06aNXHfddZKZmemar1VeXp6UlpZKampqYF9tCm/Xrp1s3brV3dfbrl27uoD2Gzx4sAvdnTt3Vvias2bNcjVz/9K2bdtqfY8AAERdSPfq1UuWLFki69evlwULFkhBQYHcddddcuLECTl48KCrCTdv3jzoMRrIuk3p7bkB7d/u31aRrKwsV1P3L/v27auW9wcAQNQ2dw8ZMiTwc7du3Vxot2/fXl566SVp2LBhtb1u/fr13QIAgGURb+4+l9aab7zxRvH5fO489ZkzZ+TYsWNB+2jvbv85bL09v7e3/36o89wAAEQTUyF98uRJ+fzzzyUpKUl69uzpemlv2rQpsH337t3unHWfPn3cfb396KOP5PDhw4F9Nm7cKE2bNpUuXbpE5D0AABATzd2/+MUvJC0tzTVxHzhwQHJycqR27dry/e9/33XoGjVqlEycOFHi4+Nd8I4fP94Fs/bsVoMGDXJhPGLECJk9e7Y7Dz1lyhR3bTXN2UDVSUvPkKIjR0Nuyy8olGQONhB7If33v//dBfI///lPSUhIkDvvvNNdXqU/q7lz50qtWrXcICZ6yZT23H7mmWcCj9dAX7NmjYwZM8aFd+PGjWXkyJGSm5sbwXcFxB4N6OThof+u9uRkhr08QE0R0ZBevnz5Rbc3aNBA5s+f75aKaC183bp11VA6AAAiy9Q5aQAA8G+ENAAARhHSAAAYRUgDAGAUIQ0AgFERnwULgI1rnpMS4mX1qhX8OgBDCGmghqnommff0mwTg6PwZQH4N0IagJPv80lK34FhGVXsYoOjhPPLAmAdIQ3AKfPiGFUMMIaOYwAAGEVIAwBgFCENAIBRhDQAAEYR0gAA
GEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEbViXQBAMSufJ9PUvoOvHB9QaEkR6REQHQhpAFUmzIvTpKH516wfk9OJkcduAQ0dwMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGcZ00EIPS0jOk6MjRkNsYSASIHoQ0EIM0oEMNIqIYSASIHjR3AwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEZxnTQAU/J9PknpOzDktqSEeFm9akXYywRECiENwJQyL67CgVh8S7PDXh4gkmjuBgDAKEIaAACjaO4GEPXnqzlXjVhFSAOI+vPVnKtGrKK5GwAAowhpAACMIqQBADCKkAYAwChCGgAAo+jdDUSptPQMKTpyNOS2/IJCSQ57iQBUNUIaiFIa0BUNn7knJzPs5QFQ9WjuBgDAKEIaAACjCGkAAIwipAEAMIqQBgDAKHp3A6ixl6oxexasI6QB1NhL1Zg9C9bR3A0AgFFmQvrxxx+XuLg4efTRRwPrSkpKZOzYsdKyZUtp0qSJpKeny6FDh4Iet3fvXhk6dKg0atRIWrVqJZMmTZKysrIIvAMAAGIwpN9991159tlnpVu3bkHrJ0yYIKtXr5aVK1fKm2++KQcOHJBhw4YFtpeXl7uAPnPmjGzZskVeeOEFWbJkiWRnZ0fgXQAAEGMhffLkScnMzJTf/e530qJFi8D64uJiWbRokcyZM0f69+8vPXv2lMWLF7sw3rZtm9tnw4YNsmvXLlm6dKn06NFDhgwZItOnT5f58+e74AYAIJpFPKS1OVtrw6mpqUHr8/LypLS0NGh9p06dpF27drJ161Z3X2+7du0qiYmJgX0GDx4sx48fl507d1b4mqdPn3b7nLsAAGBNRHt3L1++XN5//33X3H2+gwcPSr169aR58+ZB6zWQdZt/n3MD2r/dv60is2bNkscee6yK3gUAADFWk963b5/87Gc/kxdffFEaNGgQ1tfOyspyzen+RcsCAIA1EQtpbc4+fPiwfOMb35A6deq4RTuHPf300+5nrRHreeVjx44FPU57d7du3dr9rLfn9/b23/fvE0r9+vWladOmQQsAANZELKQHDBggH330kWzfvj2wpKSkuE5k/p/r1q0rmzZtCjxm9+7d7pKrPn36uPt6q8+hYe+3ceNGF7pdunSJyPsCACDqz0lfddVVcssttwSta9y4sbsm2r9+1KhRMnHiRImPj3fBO378eBfMvXv3dtsHDRrkwnjEiBEye/Zsdx56ypQprjOa1pYBAIhmpocFnTt3rtSqVcsNYqI9srXn9jPPPBPYXrt2bVmzZo2MGTPGhbeG/MiRIyU3N/QQgAAARBNTIb158+ag+9qhTK951qUi7du3l3Xr1oWhdAAA1LDrpAEAQGiENAAARplq7gZw6XMh5xcUSjIHDIhphDQQpXMh78nJDHt5AIQXzd0AABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUVwnDSDq5ft8ktJ3YOhtDPqCKEZIA4h6ZV4cg74gJtHcDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGMSwogBrrYmN+JyXEy+pVK8JeJuBchDSAGutiY377lmaHvTzA+WjuBgDAKEIaAACjCGkAAIwipAEAMIqQBgDAKEIaAACjuAQLMCAtPUOKjhy9YH1+QaEkR6REACwgpAEDNKBDXa+7JyczIuUBYAPN3QAAGEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEbViXQBgJoiLT1Dio4cDbktv6BQksNeIgDWEdJAmGhAJw/PDbltT04mvwcAF6C5GwAAowhpAACMIqQB
ADCKc9IAEEK+zycpfQdesD4pIV5Wr1rBMUNYENIAEEKZFxeyo59vaTbHC2FDczcAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAEAshfR1110n//znPy9Yf+zYMbcNAABEKKQLCwulvLz8gvWnT5+W/fv3X/LzLFiwQLp16yZNmzZ1S58+feRPf/pTYHtJSYmMHTtWWrZsKU2aNJH09HQ5dOhQ0HPs3btXhg4dKo0aNZJWrVrJpEmTpKys7HLeFgAA0Tss6GuvvRb4+c9//rM0a9YscF9De9OmTdKhQ4dLfr5rr71WHn/8cbnhhhvE8zx54YUX5L777pMPPvhAbr75ZpkwYYKsXbtWVq5c6V5r3LhxMmzYMHn77bcDr6kB3bp1a9myZYsUFRXJgw8+KHXr1pWZM2dW5q0BABDdIX3//fe727i4OBk5cmTQNg1GDegnn3zykp8vLS0t6P6MGTNc7Xrbtm0uwBctWiTLli2T/v37u+2LFy+Wzp07u+29e/eWDRs2yK5du+T111+XxMRE6dGjh0yfPl0mT54s06ZNk3r16oV8Xa3x6+J3/PjxyhwGAADsNXefPXvWLe3atZPDhw8H7uuiobd792759re/fVkF0Vrx8uXL5dSpU67ZOy8vT0pLSyU1NTWwT6dOndxrb9261d3X265du7qA9hs8eLAL3Z07d1b4WrNmzXI1c//Stm3byyozAADmzkkXFBTI1VdfXSUF+Oijj9z55vr168sjjzwir7zyinTp0kUOHjzoasLNmzcP2l8DWbcpvT03oP3b/dsqkpWVJcXFxYFl3759VfJeAAAwMVWlnn/WxV+jPtfzzz9/yc9z0003yfbt211Y/u///q9rRn/zzTelOukXAl0AAIi5kH7sscckNzdXUlJSJCkpyZ2jvlxaW05OTnY/9+zZU9599135n//5H8nIyJAzZ864y7rOrU1r727tKKb09p133gl6Pn/vb/8+AADUqJBeuHChLFmyREaMGFHlBfKf39bA1s5oWlvXS6+UnvPWS670nLXSW+1sprV5vfxKbdy40V3OpU3mAADUuJDWGu4dd9xxxS+u54aHDBniOoOdOHHC9eTevHlz4PKuUaNGycSJEyU+Pt4F7/jx410wa89uNWjQIBfG+mVh9uzZ7jz0lClT3LXVNGcDAGpkx7GHH37YBeqV0hqwXtes56UHDBjgmro1oAcOHOi2z5071/UW15p03759XRP2yy+/HHh87dq1Zc2aNe5Ww3v48OHu+bQpHgCAGlmT1pHAnnvuOXd9so4Yps3S55ozZ84lPY9eB30xDRo0kPnz57ulIu3bt5d169ZdYskBAIjxkN6xY4cbOER9/PHHQduupBMZAAC4wpB+4403LudhAACgEpiqEgCAWKpJ9+vX76LN2n/5y1+upEwAAOByQ9p/PtpPx9jWUcP0/PT5E28AAIAwhrReGhWKzjx18uTJyywKAACotnPSep1yZcbtBgAAYQppnTpSr20GAAARau4eNmxY0H3P86SoqEjee+89mTp1ahUUCwAAXFZI67ja56pVq5Yb2lOH49TxtAEAQIRCevHixVXw0gAAoMpD2i8vL08++eQT9/PNN98st95665U8HQAAuNKQ1tmrHnjgATetZPPmzd26Y8eOuUFOli9fLgkJCZfztAAA4Ep7d+u8zjr/886dO+Xo0aNu0YFMjh8/Lj/96U8v5ykBAEBV1KTXr1/vpqns3LlzYF2XLl3clJJ0HAMAIII16bNnz14wh7TSdboNAABEKKT79+8vP/vZz+TAgQOBdfv375cJEybIgAEDqqBYAADgskL6t7/9rTv/3KFDB7n++uvd0rFjR7du3rx5HFUAACJ1Trpt27by/vvvu/PSn376qVun56dTU1OrokwAAKCyIa3zRI8bN062bdsmTZs2lYEDB7pFFRcXu2ulFy5cKHfddRcHF0BMyvf5JKXv//3fO19SQrysXrUi7GVC7KpU
SD/11FMyevRoF9Chhgr98Y9/LHPmzCGkAcSsMi9OkofnhtzmW5od9vIgtlXqnPSHH34o99xzT4Xb9fIrHYUMAACEOaQPHToU8tIrvzp16siRI0eqoFgAAKBSIX3NNde4kcUqsmPHDklKSuKoAgAQ7pC+99573XzRJSUlF2z76quvJCcnR7797W9XRbkAAKjxKtVxbMqUKfLyyy/LjTfe6Hp56xzSSi/D0iFBy8vL5de//nWNP6gAAIQ9pBMTE2XLli0yZswYycrKEs/z3Pq4uDgZPHiwC2rdBwAARGAwk/bt28u6devkyy+/FJ/P54L6hhtukBYtWlRBcQAAwBWNOKY0lG+77bbLfTgAAKiOsbsBAED1I6QBADCKkAYAwChCGgAAowhpAABirXc3gAulpWdI0ZGjIQ9NfkGhJHPQAFQCIQ1UIQ3oiqYx3JOTybEGUCk0dwMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYFSdSBcAAGJFvs8nKX0HhtyWlBAvq1etCHuZEN0IaQCoImVenCQPzw25zbc0m+OMSqO5GwAAowhpAACMIqQBADCKc9LAZUhLz5CiI0cvWJ9fUCjJHFFU0eeJzmYgpIHLoP9QQ3UQ2pOTyfFElX2e6GwGmrsBADCKkAYAwChCGgAAowhpAACMimhIz5o1S2677Ta56qqrpFWrVnL//ffL7t27g/YpKSmRsWPHSsuWLaVJkyaSnp4uhw4dCtpn7969MnToUGnUqJF7nkmTJklZWVmY3w0AADEU0m+++aYL4G3btsnGjRultLRUBg0aJKdOnQrsM2HCBFm9erWsXLnS7X/gwAEZNmxYYHt5ebkL6DNnzsiWLVvkhRdekCVLlkh2NkPwAQCiW0QvwVq/fn3QfQ1XrQnn5eVJ3759pbi4WBYtWiTLli2T/v37u30WL14snTt3dsHeu3dv2bBhg+zatUtef/11SUxMlB49esj06dNl8uTJMm3aNKlXr16E3h0AADF0TlpDWcXHx7tbDWutXaempgb26dSpk7Rr1062bt3q7utt165dXUD7DR48WI4fPy47d+4M+TqnT592289dAACwxkxInz17Vh599FH55je/Kbfccotbd/DgQVcTbt68edC+Gsi6zb/PuQHt3+7fVtG58GbNmgWWtm3bVtO7AgAgBkJaz01//PHHsnz58mp/raysLFdr9y/79u2r9tcEACAqhwUdN26crFmzRt566y259tprA+tbt27tOoQdO3YsqDatvbt1m3+fd955J+j5/L2//fucr379+m4BAMCyiIa053kyfvx4eeWVV2Tz5s3SsWPHoO09e/aUunXryqZNm9ylV0ov0dJLrvr06ePu6+2MGTPk8OHDrtOZ0p7iTZs2lS5dukTgXQHAhfJ9PknpOzDkoWFiFpgMaW3i1p7br776qrtW2n8OWc8TN2zY0N2OGjVKJk6c6DqTafBqqGswa89upZdsaRiPGDFCZs+e7Z5jypQp7rmpLQOwosyLCzmJhmJiFpgM6QULFrjbu+++O2i9Xmb10EMPuZ/nzp0rtWrVcjVp7ZWtPbefeeaZwL61a9d2TeVjxoxx4d24cWMZOXKk5OaG/mMAACBaRLy5++s0aNBA5s+f75aKtG/fXtatW1fFpQMAILLM9O4GAADBCGkAAIwipAEAMIqQBgDAKEIaAACjTIw4BliUlp4hRUeOhtzG4BMAwoGQBiqgAc3gEwAiieZuAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMIqQBADCKkAYAwChCGgAAowhpAACMqhPpAgCRlJaeIUVHjobcll9QKMlhLxEA/BshjRpNAzp5eG7IbXtyMsNeHgA4F83dAAAYRUgD
AGAUIQ0AgFGENAAARhHSAAAYRUgDAGAUIQ0AgFGENAAARjGYCQAYle/zSUrfgSG3JSXEy+pVK8JeJoQXIQ0ARpV5cRWOiOdbmh328iD8aO4GAMAoQhoAAKMIaQAAjCKkAQAwipAGAMAoQhoAAKMIaQAAjCKkAQAwipAGAMAoQhoAAKMYFhQxLy09Q4qOHA25Lb+gUJLDXiIAuDSENGKeBnRF4x/vyckMe3kA4FLR3A0AgFGENAAARhHSAAAYxTlpAIhC+T6fpPQdGHJbUkK8rF61IuxlQtUjpAEgCpV5cRV2iPQtzQ57eVA9aO4GAMAoQhoAAKMIaQAAjCKkAQAwipAGAMAoQhoAAKMIaQAAjCKkAQAwipAGAMAoQhoAAKMIaQAAjGLsbgCoIZNvMPFG9IloTfqtt96StLQ0adOmjcTFxckf//jHoO2e50l2drYkJSVJw4YNJTU1VT777LOgfY4ePSqZmZnStGlTad68uYwaNUpOnjwZ5ncCAPYm3zh/KTpyNNJFQzSF9KlTp6R79+4yf/78kNtnz54tTz/9tCxcuFD+9re/SePGjWXw4MFSUlIS2EcDeufOnbJx40ZZs2aNC/4f/ehHYXwXAADEYHP3kCFD3BKK1qKfeuopmTJlitx3331u3e9//3tJTEx0Ne4HHnhAPvnkE1m/fr28++67kpKS4vaZN2+e3HvvvfLf//3froYeyunTp93id/z48Wp5fwAAxGTHsYKCAjl48KBr4vZr1qyZ9OrVS7Zu3eru6602cfsDWun+tWrVcjXvisyaNcs9l39p27ZtNb8bAABiKKQ1oJXWnM+l9/3b9LZVq1ZB2+vUqSPx8fGBfULJysqS4uLiwLJv375qeQ8AAFyJGtm7u379+m4BAMAyszXp1q1bu9tDhw4Frdf7/m16e/jw4aDtZWVlrse3fx8AAKKV2ZDu2LGjC9pNmzYFdfDSc819+vRx9/X22LFjkpeXF9jnL3/5i5w9e9aduwYAIJpFtLlbr2f2+XxBncW2b9/uzim3a9dOHn30UfnNb34jN9xwgwvtqVOnuh7b999/v9u/c+fOcs8998jo0aPdZVqlpaUybtw41/O7op7diF1p6RkhrwPNLyiU5IiUCACiOKTfe+896devX+D+xIkT3e3IkSNlyZIl8stf/tJdS63XPWuN+c4773SXXDVo0CDwmBdffNEF84ABA1yv7vT0dHdtNWoeDWgdsOF8e3IyI1IeAIjqkL777rvd9dAV0VHIcnNz3VIRrXUvW7asmkoIAEDkmD0nDQBATUdIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBRhDQAAEZFdIINAED45Pt8ktJ3YMhtSQnxsnrVCn4dxhDSAFBDlHlxIadzVb6l2WEvD74ezd0AABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGMZgJAIDRyIwipAEAjEZmFM3dAAAYRU0aUSUtPUOKjhwNuS2/oFCSw14iAKg+hDSiigZ0RRME7MnJDHt5AKA60dwNAIBRhDQAAEYR0gAAGEVIAwBgFCENAIBR9O6GOVxmBQD/h5CGOVxmBdiS7/NJSt+BIbclJcTL6lUrwl6mmoKQBgBcVJkXV+H4BL6l2Ry9asQ5aQAAjCKkAQAwiuZuRASdwwDg6xHSiAg6hwHA16O5GwAAowhpAACMIqQBADCKkAYAwCg6jgEAqnw0MkYiqxqENACgykcjYySyqkFzNwAARlGTRkQGLckvKJRkjj0AXBQhjYgMWrInJ5MjDwBfg+ZuAACMIqQBADCK5m4AQNguzVL7934h17RrH3Ibl24FI6RxxZjRCsClXprl75NS0TYu3QpGSOOKMaMVAFQPzkkDAGAUNWlcEpq0ASD8CGlcEpq0ASD8aO4GAMAoQhoAAKMIaQAAjOKcNAAgKgZBSUqIl9WrVkhNQkgjCLNWAbA6CIpv
abbUNIQ0gjBrFQDYQUgDAKK6KTwphpvBCekaOPjIxQa3zy8olORqLhsAVGVTuC+Gm8FjJqTnz58vTzzxhBw8eFC6d+8u8+bNk9tvv11qqq8bfORi2wAgmuTHcGezmAjpFStWyMSJE2XhwoXSq1cveeqpp2Tw4MGye/duadWqlcRqrTjaP3wAUN2dzTZM+0GlA/xiLZHh/r8bEyE9Z84cGT16tPzwhz909zWs165dK88//7z86le/klitFV/sw0ezNQDIZfUWv1hLZLib1qM+pM+cOSN5eXmSlZUVWFerVi1JTU2VrVu3hnzM6dOn3eJXXFzsbo8fP37F5fl/wx+SQ//4MuS2xKtbyEtLl1T6OcvLyqT0q1MXrC8t96R9+uSQj9k98+GQj1He2bOV3nY5jwnn88Xqa0Vz2cP5WtFc9nC+VjSXvTpe6/M9e+TWO/pdsL7wi73SvoLH6P/jqsgKv6uuukri4uIq3sGLcvv37/f0bWzZsiVo/aRJk7zbb7895GNycnLcY1g4BnwG+AzwGeAzIBE8BsXFxRfNuKivSV8OrXXrOWy/s2fPytGjR6Vly5YX/0ZTSfptq23btrJv3z5p2rSpRBPKzjHn82JftP6dRmu5q6PsWpO+mKgP6auvvlpq164thw4dClqv91u3bh3yMfXr13fLuZo3b15tZdRfZLR9EP0oO8ecz4t90fp3Gq3lDmfZo36CjXr16knPnj1l06ZNQTVjvd+nT5+Ilg0AgCsR9TVppU3XI0eOlJSUFHdttF6CderUqUBvbwAAolFMhHRGRoYcOXJEsrOz3WAmPXr0kPXr10tiYmJEy6VN6jk5ORc0rUcDys4x5/NiX7T+nUZruSNR9jjtPRaWVwIAADXrnDQAALGKkAYAwChCGgAAowhpAACMIqSryXe+8x1p166dNGjQQJKSkmTEiBFy4MCBoH127Nghd911l9tHR7CZPXu2RFphYaGMGjVKOnbsKA0bNpTrr7/e9WTUMdLP3UdHZjt/2bZtm+lyWz3masaMGXLHHXdIo0aNKhxYJ9QxX758uURD2ffu3StDhw51++jMdJMmTZKysjKxpkOHDhcc48cff1ysTs+r5dXPss7+984774h106ZNu+D4durUSSx66623JC0tTdq0aePK+cc//jFou/a51iuK9P+7/s/R+SI+++yzKi8HIV1N+vXrJy+99JKbLnPVqlXy+eefy/e+972goeUGDRok7du3dxOE6FzY+gF+7rnnJJI+/fRTNxjMs88+Kzt37pS5c+e6WcX+67/+64J9X3/9dSkqKgosOqiM5XJbPeZKv0z8x3/8h4wZM+ai+y1evDjomN9///1ivezl5eUuoHW/LVu2yAsvvCBLlixx/+Asys3NDTrG48ePF6vT8+oX0ffff1+6d+/upuc9fPiwWHfzzTcHHd+//vWvYtGpU6fccdUvQ6HoF/ynn37a/Z/529/+Jo0bN3a/g5KSkqotSFVOdoGKvfrqq15cXJx35swZd/+ZZ57xWrRo4Z0+fTqwz+TJk72bbrrJ3GGcPXu217Fjx8D9goICNzD8Bx984Fl2frmj4ZgvXrzYa9asWchtesxfeeUVz6qKyr5u3TqvVq1a3sGDBwPrFixY4DVt2jTod2FB+/btvblz53rW6eRBY8eODdwvLy/32rRp482aNcuzTCc36t69uxdt5Ly/vbNnz3qtW7f2nnjiicC6Y8eOefXr1/f+8Ic/VOlrU5MOA52848UXX3RNgnXr1nXrdBrNvn37umFN/fRbmNa8v/wy9FSXkaJTecbHx4ds0temyzvvvFNee+01seb8ckfTMa/I2LFj3Xj1OrKezpceDcMc6HHv2rVr0OBCety1ZUNbPazR5m2dbOfWW291rS3WmuX90/Nq8+qlTs9riTYJaxPyddddJ5mZme5USLQpKChwA2ed+zto1qyZO+1Q1b8DQroaTZ482TWB6B+8fhBfffXVwDb9BZ8/Ipr/vm6zwufzybx5
8+THP/5xYF2TJk3kySeflJUrV8ratWtdSGuzq6WgDlXuaDnmF2uG1VMoGzdulPT0dPnJT37i3qN10XTcf/rTn7rz/G+88Yb77MycOVN++ctfiiX/+Mc/3CmEUMfU2vE8n4aYnurQESEXLFjgwk77iJw4cUKiycF/Hedw/A4I6Ur41a9+FbLzzrmLnhv1084xH3zwgWzYsMHN1PXggw9GrOZT2bKr/fv3yz333OPON44ePTqwXmtyej5M/+Buu+02V/MYPny4q3VYLne4XU7ZL2bq1KnyzW9+09Xw9Aughkd1HPPqKHskVea96Of67rvvlm7duskjjzzivozqF6HTp09H+m3EhCFDhri/Sz2+2pqybt06OXbsmPvyiRgeuztcfv7zn8tDDz100X20CefcMNPlxhtvlM6dO7vexNoDWmfn0mk0Q02vqSqaYjOcZdee6Nr5TZvoL6VjlQa21vAsl9v6Ma8sPebTp093AVLV4whXZdn12J7f87g6j3tVvhc9xtrcrVcP3HTTTRKt0/NapVcD6P9HbfmKJq3/dZz1mGvvbj+9r3NHVCVCuhISEhLccjm057HyfyPXoP71r38tpaWlgfPUGnL6j6BFixYSybJrTVSDTntra29iPd/1dbZv3x70YbVYbsvH/HLoMddyV8dA/1VZdj3uepmW9jzWPgz+465z8Xbp0kWq25W8Fz3G+jnyl9va9Lz+3v3+6XnHjRsn0eTkyZPuyhe9RDWadOzY0QW1HnN/KGsfC+3l/XVXaFRalXZDg7Nt2zZv3rx5rvdzYWGht2nTJu+OO+7wrr/+eq+kpCTQEzAxMdEbMWKE9/HHH3vLly/3GjVq5D377LMRPYp///vfveTkZG/AgAHu56KiosDit2TJEm/ZsmXeJ5984pYZM2a43rvPP/+86XJbPebqiy++cJ+Xxx57zGvSpIn7WZcTJ0647a+99pr3u9/9zvvoo4+8zz77zPVU17JnZ2ebL3tZWZl3yy23eIMGDfK2b9/urV+/3ktISPCysrI8S7Zs2eJ6dmsZP//8c2/p0qWunA8++KBnjX52tSex/i3u2rXL+9GPfuQ1b948qAe9RT//+c+9zZs3uytE3n77bS81NdW7+uqrvcOHD3vWnDhxIvBZ1qicM2eO+1k/7+rxxx93x1yv3NmxY4d33333uatJvvrqqyotByFdDfQX1q9fPy8+Pt79IXXo0MF75JFHXHic68MPP/TuvPNOt88111zjfukWLqPRD2SoxU//MXTu3NmFhF5Go5eDrFy50ny5rR5zNXLkyJBlf+ONN9z2P/3pT16PHj1cCDZu3NhdxrJw4UJ36Y31siv9sjpkyBCvYcOG7p+y/rMuLS31LMnLy/N69erlLiNr0KCB+4zPnDkz8MXaGq0ItGvXzqtXr577G9TKgXUZGRleUlKSK7P+/el9n8/nWfTGG2+E/Fzr591/GdbUqVPdF3/9f6IVhN27d1d5OZiqEgAAo+jdDQCAUYQ0AABGEdIAABhFSAMAYBQhDQCAUYQ0AABGEdIAABhFSAMAYBQhDaBSdKpBnRjhUm3evNnNNKWzHQGoHEIaqAF0InqdOWno0KGVelyHDh3kqaeeClqXkZEhe/bsueTn0BnJioqKpFmzZpcV8kBNRkgDNcCiRYtk/Pjx8tZbb7npPK9Ew4YNKzUrlM7apDMGaW0aQOUQ0kCM0+kAV6xY4abQ05q01mTPtXr1arntttukQYMGbq7i7373u2793XffLV988YVMmDDBBaw/ZM+tCWuNWtd/+umnQc85d+5cuf766y9o7taff/jDH0pxcXHgOadNmya5ublyyy23XFB2nQZw6tSp1XZsAOsIaSDGvfTSS9KpUyc3b/bw4cPl+eef16nB3La1a9e6UL733nvlgw8+cPPj3n777W7byy+/LNdee60LUG2u1uV8N954o6SkpMiLL74YtF7v/+AHPwjZ9K3N5zqXtP85f/GLX8h//ud/yieffCLvvvtu
YF8tz44dO1yoAzUVIQ3UgKZuDWd1zz33uFrsm2++6e7PmDFDHnjgAXnsscekc+fO0r17d8nKynLb4uPj3Xnsq666yjVX6xJKZmam/OEPfwjc19p1Xl6eWx+q6VvPTWsN2v+cTZo0cV8GBg8eLIsXLw7sqz9/61vfkuuuu67KjwkQLQhpIIbt3r1b3nnnHfn+97/v7tepU8d1/NLgVtu3b5cBAwZc0WtoyBcWFsq2bdsCtehvfOMbrvZeGaNHj3ZhX1JSImfOnJFly5a5GjZQk9WJdAEAVB8N47KyMmnTpk1gnTZ1169fX37729+6TmBXSmvD/fv3d6Hau3dvd6vnvysrLS3NleuVV15xNe7S0lL53ve+d8XlA6IZNWkgRmk4//73v5cnn3zS1Zj9y4cffuhCW2ut3bp1c+ehK6JhWV5e/rWvpU3b2jlNL/XKz893tevKPqfW8keOHOmauXXR56iKLxFANKMmDcSoNWvWyJdffimjRo0KXKPsl56e7mrZTzzxhGvu1p7YGooa7OvWrZPJkycHrpPWy7Z0m9Zytfd3KMOGDXO1Z1369esXVHM/nz6n9jjXLwd6DrxRo0ZuUQ8//LA7N67efvvtKjwaQHSiJg3EKA3h1NTUCwLaH9Lvvfee6xy2cuVKee2119zlTtpsreew/bRnt55v1hBPSEio8LW0c5k2V2stPVSHsfN7eD/yyCPu3Lg+5+zZswPbbrjhBrddz2f36tXrst87ECviPP+1GAAQYfrvSIP6Jz/5iUycODHSxQEijuZuACYcOXJEli9fLgcPHuTaaOBfCGkAJuhQo3rO+7nnnpMWLVpEujiACYQ0ABM48wZciI5jAAAYRUgDAGAUIQ0AgFGENAAARhHSAAAYRUgDAGAUIQ0AgFGENAAAYtP/BzEBPjS9wqmLAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 500x500 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df['Activity'] = -np.log10(np.exp(df.dG_bind/0.5961)/1e-6)\n",
    "sns.displot(df.Activity);"
   ]
  },
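  {
   "cell_type": "markdown",
   "id": "c58a7e02",
   "metadata": {},
   "source": [
    "As a quick sanity check on the conversion, the sketch below applies the same expression to the first two `dG_bind` values shown in the table and compares it with an algebraically equivalent closed form.  The variable names are illustrative only.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "RT = 0.5961  # kcal/mol, the constant used in the cell above\n",
    "dG = np.array([-2.995, 8.731])  # first two dG_bind values from the table\n",
    "\n",
    "# Same expression as in the cell above\n",
    "activity = -np.log10(np.exp(dG / RT) / 1e-6)\n",
    "\n",
    "# Equivalent closed form: -dG / (RT * ln 10) - 6\n",
    "closed_form = -dG / (RT * np.log(10)) - 6\n",
    "\n",
    "print(activity.round(2))\n",
    "print(np.allclose(activity, closed_form))  # True\n",
    "```\n",
    "\n",
    "The closed form shows that Activity is linear in `dG_bind`: a more negative binding free energy gives a larger Activity, so larger values are better in the search that follows."
   ]
  },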
  {
   "cell_type": "markdown",
   "id": "a8467530",
   "metadata": {},
   "source": [
    "### Setup for Machine Learning "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dfa0db27",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Add a fingerprint column to the dataframe. The fingerprint descriptors will be used to train the machine learning model. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "12417e8f-f9bb-4f5d-bf89-593247971303",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "0148265323434e4c9de62f4096bca350",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/9997 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "smi2fp = uru.Smi2Fp()\n",
    "df['fp'] = df.SMILES.progress_apply(smi2fp.get_np)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d399cef5",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Create a pool of fingerprints for the active learning algorithm to draw from."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "094e011b-da93-49ab-9798-f0e733069774",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 8.52 ms, sys: 7.27 ms, total: 15.8 ms\n",
      "Wall time: 12.8 ms\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "X_pool = np.stack(df.fp.values)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e72593d",
   "metadata": {},
   "source": [
    "### Define Helper Functions for Active Learning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1452e741",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "For the greedy search, we want to select the best-scoring molecules without selecting the same molecule more than once. This function accepts a list of predictions and does the following:\n",
    "- Sort by score, highest first\n",
    "- Remove the molecules that were previously selected\n",
    "- Return the indices of the top **num_to_choose**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "cc13556d",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "def find_best_idx(predicted, used, num_to_choose):\n",
    "    # Pair each prediction with its index and sort by score, highest first\n",
    "    tmp_list = list(enumerate(predicted))\n",
    "    tmp_list.sort(key=itemgetter(1), reverse=True)\n",
    "    # Drop indices that were already selected (a set makes the lookup fast)\n",
    "    used_set = set(used)\n",
    "    tmp_list = [x[0] for x in tmp_list if x[0] not in used_set]\n",
    "    return tmp_list[:num_to_choose]"
   ]
  },
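  {
   "cell_type": "markdown",
   "id": "d91b3a47",
   "metadata": {},
   "source": [
    "The three steps can be traced on a toy example.  The scores and the set of previously used indices below are made up for illustration, and the snippet is a standalone sketch rather than a call into the helper above.\n",
    "\n",
    "```python\n",
    "from operator import itemgetter\n",
    "\n",
    "scores = [0.2, 0.9, 0.5, 0.7]  # toy predictions; position = molecule index\n",
    "already_used = {1}  # molecule 1 was picked in an earlier cycle\n",
    "\n",
    "# Step 1: sort (index, score) pairs by score, highest first\n",
    "ranked = sorted(enumerate(scores), key=itemgetter(1), reverse=True)\n",
    "# Step 2: drop previously selected molecules; step 3: keep the top 2 indices\n",
    "picked = [idx for idx, _ in ranked if idx not in already_used][:2]\n",
    "print(picked)  # [3, 2]\n",
    "```\n",
    "\n",
    "Molecule 1 has the highest score but is skipped because it was already evaluated, so the next two best molecules are returned."
   ]
  },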
  {
   "cell_type": "markdown",
   "id": "f1df581e",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Here are a couple of acquisition functions.  The first, **greedy**, simply selects the **n** top-scoring molecules to be evaluated by the oracle.  The second, **my_max_PI**, maximizes the Probability of Improvement (PI) and uses both the predicted scores and their uncertainty to balance exploration and exploitation.  The git repo associated with our paper [\"Optimizing active learning for free energy calculations\"](https://www.sciencedirect.com/science/article/pii/S2667318522000204) has examples of several other acquisition functions.  The TL;DR from our work: the choice of acquisition function doesn't make a huge difference."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "241046d8",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "def greedy(optimizer: BaseLearner, X: modALinput, n_instances=1, used=()):\n",
    "    # Rank the pool by predicted value and take the top unused molecules\n",
    "    res = optimizer.predict(X)\n",
    "    best_idx = find_best_idx(res, used, n_instances)\n",
    "    return best_idx, X[best_idx]\n",
    "\n",
    "\n",
    "def my_max_PI(optimizer: BaseLearner, X: modALinput, tradeoff: float = 0,\n",
    "              n_instances: int = 1, used=(), cycle=-1) -> tuple:\n",
    "    # Rank the pool by Probability of Improvement; `cycle` is accepted but unused\n",
    "    pi = optimizer_PI(optimizer, X, tradeoff=tradeoff)\n",
    "    best_idx = find_best_idx(pi, used, n_instances)\n",
    "    return best_idx, X[best_idx]"
   ]
  },
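  {
   "cell_type": "markdown",
   "id": "a7c1e002",
   "metadata": {},
   "source": [
    "For intuition, PI is usually defined as the probability, under the model's Gaussian prediction, that a molecule beats the best score seen so far by at least the tradeoff.  The sketch below uses that textbook formula with only the standard library.  It is not a copy of modAL's implementation, and the function name and the numbers are assumptions made for this example.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def probability_of_improvement(mean, std, best_so_far, tradeoff=0.0):\n",
    "    # standard normal CDF evaluated at the standardized improvement\n",
    "    z = (mean - best_so_far - tradeoff) / std\n",
    "    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))\n",
    "\n",
    "best_so_far = 1.0\n",
    "# (predicted mean, predicted standard deviation) for three hypothetical molecules\n",
    "candidates = [(0.9, 0.05), (0.8, 0.50), (1.0, 0.10)]\n",
    "pi = [probability_of_improvement(m, s, best_so_far) for m, s in candidates]\n",
    "for (m, s), p in zip(candidates, pi):\n",
    "    print(f'mean={m:.2f} std={s:.2f} PI={p:.3f}')\n",
    "```\n",
    "\n",
    "The second molecule has a lower predicted score than the first but a much larger uncertainty, so its PI is higher.  That is the exploration side of the tradeoff; a greedy strategy would rank the first molecule above it."
   ]
  },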
  {
   "cell_type": "markdown",
   "id": "53dc2b63",
   "metadata": {},
   "source": [
    "### Create an Oracle"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73b52131",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Create an oracle that will return values. This example oracle just looks up a value in a table.  In practice, you'd put code here to do a more expensive calculation.  The notebook **active_shape_search.ipynb** has a complete implementation of an oracle. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "aefb3804",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "value_column = \"Activity\"\n",
    "oracle = Oracle(df, value_column)"
   ]
  },
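  {
   "cell_type": "markdown",
   "id": "a7c1e003",
   "metadata": {},
   "source": [
    "To make the idea concrete, here is a minimal sketch of a table-lookup oracle.  `LookupOracle` is a hypothetical class written for this example; it is not the `Oracle` class used in this notebook, and its `sample` method returns only indices and values rather than descriptors.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "class LookupOracle:\n",
    "    # hypothetical table-lookup oracle; NOT the Oracle class used in this notebook\n",
    "    def __init__(self, values):\n",
    "        self.values = list(values)\n",
    "\n",
    "    def sample(self, n, seed=0):\n",
    "        # pick n distinct random row indices and return them with their values\n",
    "        idx = random.Random(seed).sample(range(len(self.values)), n)\n",
    "        return idx, self.get_values(idx)\n",
    "\n",
    "    def get_values(self, idx_list):\n",
    "        # in a real workflow, this is where docking or FEP would run\n",
    "        return [self.values[i] for i in idx_list]\n",
    "\n",
    "toy_oracle = LookupOracle([-3.6, -13.1, -5.7, -2.9, -12.3])\n",
    "idx, vals = toy_oracle.sample(2)\n",
    "print(idx, vals)\n",
    "print(toy_oracle.get_values([0, 3]))  # [-3.6, -2.9]\n",
    "```\n",
    "\n",
    "Keeping the expensive calculation behind a single `get_values` call means the active learning loop does not need to change when the lookup is swapped for a real calculation."
   ]
  },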
  {
   "cell_type": "markdown",
   "id": "d12bfa14",
   "metadata": {},
   "source": [
    "### Run Active Learning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf69645c",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "This is the main active learning loop."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "03d3defe-55e6-4e34-bce7-a429b3ca9acb",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "64a090e87db84b59b85f633e7d171337",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/opt/homebrew/Caskroom/miniforge/base/envs/rdkit_2025_10/lib/python3.11/site-packages/sklearn/utils/deprecation.py:132: FutureWarning: 'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.\n",
      "  warnings.warn(\n",
      "/opt/homebrew/Caskroom/miniforge/base/envs/rdkit_2025_10/lib/python3.11/site-packages/sklearn/utils/deprecation.py:132: FutureWarning: 'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.\n",
      "  warnings.warn(\n",
      "/opt/homebrew/Caskroom/miniforge/base/envs/rdkit_2025_10/lib/python3.11/site-packages/sklearn/utils/deprecation.py:132: FutureWarning: 'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.\n",
      "  warnings.warn(\n",
      "/opt/homebrew/Caskroom/miniforge/base/envs/rdkit_2025_10/lib/python3.11/site-packages/sklearn/utils/deprecation.py:132: FutureWarning: 'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.\n",
      "  warnings.warn(\n",
      "/opt/homebrew/Caskroom/miniforge/base/envs/rdkit_2025_10/lib/python3.11/site-packages/sklearn/utils/deprecation.py:132: FutureWarning: 'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.\n",
      "  warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "# number of molecules to select at each active learning cycle\n",
    "n_instances = 50\n",
    "# number of active learning cycles to run\n",
    "n_cycles = 5\n",
    "\n",
    "# define the acquistion function, to change to greedy just change the line below\n",
    "query_strategy = my_max_PI\n",
    "# select an initial random cycle\n",
    "X_initial, y_initial, sample_idx = oracle.sample(n_instances)\n",
    "# instantiate the optimizer with an estimator, training data, and an acquistion function\n",
    "optimizer = BayesianOptimizer(estimator=GaussianProcessRegressor(kernel=TanimotoKernel()),\n",
    "                              X_training=np.stack(X_initial), y_training=y_initial,\n",
    "                              query_strategy=query_strategy)\n",
    "# initialize a list of scores\n",
    "val_list = [y_initial]\n",
    "# keep track of which molecules we've sampled\n",
    "used = list(sample_idx)\n",
    "# the active learning loop\n",
    "for i in tqdm(range(0, n_cycles)):\n",
    "    # ask the optimizer for the next set of molecules\n",
    "    query_idx, query_desc = optimizer.query(X_pool, n_instances=n_instances, used=used)\n",
    "    # get values from the oracle, in practice, this is where we would do the more expensive calculations\n",
    "    vals = oracle.get_values(query_idx)\n",
    "    # add the returned values to val_list\n",
    "    val_list.append(vals)\n",
    "    # keep track of the molecules we've used\n",
    "    used += query_idx\n",
    "    # update the optimizer with the new values\n",
    "    optimizer.teach(query_desc, vals)"
   ]
  },
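  {
   "cell_type": "markdown",
   "id": "a7c1e004",
   "metadata": {},
   "source": [
    "The same query, evaluate, and update pattern can be sketched without modAL.  The toy example below is only an illustration under stated assumptions: the 'molecules' are numbers between 0 and 1, the hidden score is a made-up function, and a crude nearest-neighbor predictor stands in for the Gaussian process.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "rng = random.Random(0)\n",
    "# toy pool: each 'molecule' is a number x, and the hidden score peaks at x = 0.7\n",
    "pool = [i / 199 for i in range(200)]\n",
    "true_score = [-(x - 0.7) ** 2 for x in pool]\n",
    "\n",
    "def predict(labeled, x):\n",
    "    # crude surrogate model: return the score of the nearest labeled point\n",
    "    nearest = min(labeled, key=lambda i: abs(pool[i] - x))\n",
    "    return labeled[nearest]\n",
    "\n",
    "n_instances, n_cycles = 10, 4\n",
    "# cycle 0: label a random sample\n",
    "labeled = {i: true_score[i] for i in rng.sample(range(len(pool)), n_instances)}\n",
    "for cycle in range(n_cycles):\n",
    "    # score every unlabeled molecule with the surrogate\n",
    "    preds = {i: predict(labeled, pool[i]) for i in range(len(pool)) if i not in labeled}\n",
    "    # greedy acquisition: take the best predicted molecules\n",
    "    batch = sorted(preds, key=preds.get, reverse=True)[:n_instances]\n",
    "    # 'oracle' call, then update the surrogate by adding the new labels\n",
    "    labeled.update({i: true_score[i] for i in batch})\n",
    "\n",
    "top10 = set(sorted(range(len(pool)), key=true_score.__getitem__, reverse=True)[:10])\n",
    "print(len(labeled), len(top10 & set(labeled)))\n",
    "```\n",
    "\n",
    "Because already-labeled indices are excluded before ranking, every cycle adds exactly **n_instances** new molecules, which is the role the `used` list plays in the loop above."
   ]
  },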
  {
   "cell_type": "markdown",
   "id": "eda89042",
   "metadata": {},
   "source": [
    "### Analyze Active Learning Results"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ff0b2aa",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Let's see how many of the top 100 molecules we found."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "054caae0",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "56"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# sort the initial dataframe by score\n",
    "ref_df = df.sort_values(value_column, ascending=False).head(100).copy()\n",
    "# create a new dataframe with the selected molecules and sort by score\n",
    "pick_df = df.iloc[used].sort_values(value_column, ascending=False).head(100).copy()\n",
    "# merge the two dataframes to see how many molecules are in common\n",
    "len(ref_df.merge(pick_df, on=\"ligand_id\"))"
   ]
  },
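  {
   "cell_type": "markdown",
   "id": "a7c1e005",
   "metadata": {},
   "source": [
    "The same overlap can be computed with plain sets, which may be easier to reuse when the data is not in a dataframe.  This is a minimal sketch; `top_k_overlap` and the toy numbers are assumptions made for this example.\n",
    "\n",
    "```python\n",
    "def top_k_overlap(scores, selected, k):\n",
    "    # indices of the k best-scoring items in the full dataset\n",
    "    ref = set(sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k])\n",
    "    # count how many of them were selected\n",
    "    return len(ref & set(selected))\n",
    "\n",
    "scores = [5.0, 1.0, 4.0, 3.0, 2.0, 0.5]\n",
    "selected = [0, 3, 5]  # hypothetical picks from active learning\n",
    "found = top_k_overlap(scores, selected, k=3)\n",
    "print(found)  # 2\n",
    "```\n",
    "\n",
    "Dividing the count by **k** gives the fraction of the true top molecules recovered, which is convenient for comparing runs with different settings."
   ]
  },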
  {
   "cell_type": "markdown",
   "id": "197498dd",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Let's look at how the scores were distributed across the active learning cycles. First we need to put the data in a dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "1904073c",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>SMILES</th>\n",
       "      <th>cycle</th>\n",
       "      <th>mol_idx</th>\n",
       "      <th>Activity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>NC(=O)c1cccc(Nc2cc(NC(=O)c3cc(F)ccc3F)ccn2)n1</td>\n",
       "      <td>0</td>\n",
       "      <td>9103</td>\n",
       "      <td>-3.617610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Cc1c(F)cccc1C(=O)Nc1cc(NC(=O)C2CC2(C)C)ncc1F</td>\n",
       "      <td>0</td>\n",
       "      <td>4677</td>\n",
       "      <td>-13.097629</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Cc1cc(Nc2cc(NC(=O)c3ccccc3)ccn2)nc(N2CCC2)n1</td>\n",
       "      <td>0</td>\n",
       "      <td>6704</td>\n",
       "      <td>-5.650291</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>COc1nccc(Nc2cc(NC(=O)c3c(Cl)cccc3OC)c(F)cn2)n1</td>\n",
       "      <td>0</td>\n",
       "      <td>8464</td>\n",
       "      <td>-2.938592</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Cc1ccc(C(=O)Nc2cc(Nc3ncccn3)ncc2F)c(Cl)c1</td>\n",
       "      <td>0</td>\n",
       "      <td>9461</td>\n",
       "      <td>-12.269985</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>295</th>\n",
       "      <td>Cc1cc(Nc2cc(NC(=O)c3c(F)cccc3F)ccn2)nc(NCC2CC2)n1</td>\n",
       "      <td>5</td>\n",
       "      <td>9764</td>\n",
       "      <td>2.480436</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>296</th>\n",
       "      <td>COc1ccc(Cl)c(C(=O)Nc2cc(NC(=O)NC3CC(O)C3)ncc2F)c1</td>\n",
       "      <td>5</td>\n",
       "      <td>9190</td>\n",
       "      <td>1.592321</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>297</th>\n",
       "      <td>CC(C)c1cc(Nc2cc(NC(=O)c3c(F)ccc(F)c3Cl)ccn2)nc...</td>\n",
       "      <td>5</td>\n",
       "      <td>8651</td>\n",
       "      <td>1.785390</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>298</th>\n",
       "      <td>Cc1cc(Nc2cc(NC(=O)c3cc(Cl)ccc3C)ccn2)nc(N2CC(O...</td>\n",
       "      <td>5</td>\n",
       "      <td>9784</td>\n",
       "      <td>-8.573273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>299</th>\n",
       "      <td>O=C(Nc1cc(Nc2ccncc2)ncc1F)c1c(F)cccc1Cl</td>\n",
       "      <td>5</td>\n",
       "      <td>2195</td>\n",
       "      <td>1.418924</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>300 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                SMILES  cycle  mol_idx  \\\n",
       "0        NC(=O)c1cccc(Nc2cc(NC(=O)c3cc(F)ccc3F)ccn2)n1      0     9103   \n",
       "1         Cc1c(F)cccc1C(=O)Nc1cc(NC(=O)C2CC2(C)C)ncc1F      0     4677   \n",
       "2         Cc1cc(Nc2cc(NC(=O)c3ccccc3)ccn2)nc(N2CCC2)n1      0     6704   \n",
       "3       COc1nccc(Nc2cc(NC(=O)c3c(Cl)cccc3OC)c(F)cn2)n1      0     8464   \n",
       "4            Cc1ccc(C(=O)Nc2cc(Nc3ncccn3)ncc2F)c(Cl)c1      0     9461   \n",
       "..                                                 ...    ...      ...   \n",
       "295  Cc1cc(Nc2cc(NC(=O)c3c(F)cccc3F)ccn2)nc(NCC2CC2)n1      5     9764   \n",
       "296  COc1ccc(Cl)c(C(=O)Nc2cc(NC(=O)NC3CC(O)C3)ncc2F)c1      5     9190   \n",
       "297  CC(C)c1cc(Nc2cc(NC(=O)c3c(F)ccc(F)c3Cl)ccn2)nc...      5     8651   \n",
       "298  Cc1cc(Nc2cc(NC(=O)c3cc(Cl)ccc3C)ccn2)nc(N2CC(O...      5     9784   \n",
       "299            O=C(Nc1cc(Nc2ccncc2)ncc1F)c1c(F)cccc1Cl      5     2195   \n",
       "\n",
       "      Activity  \n",
       "0    -3.617610  \n",
       "1   -13.097629  \n",
       "2    -5.650291  \n",
       "3    -2.938592  \n",
       "4   -12.269985  \n",
       "..         ...  \n",
       "295   2.480436  \n",
       "296   1.592321  \n",
       "297   1.785390  \n",
       "298  -8.573273  \n",
       "299   1.418924  \n",
       "\n",
       "[300 rows x 4 columns]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res_list = []\n",
    "for idx, v in enumerate(val_list):\n",
    "    res_list += [(idx, x) for x in v]\n",
    "res_df = pd.DataFrame(res_list, columns=[\"cycle\", value_column])\n",
    "# add row numbers for the selected molecules\n",
    "res_df['mol_idx'] = used\n",
    "# add the SMILES for the selected molecules \n",
    "res_df['SMILES'] = df.SMILES.values[used]\n",
    "# reorder the columns in res_df\n",
    "res_df = res_df[['SMILES','cycle','mol_idx','Activity']]\n",
    "res_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d68ce763",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Plot the scores of the molecules selected in each active learning round.  Remember that the first active learning cycle was randomly selected. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "9366f78c",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAj4AAAGwCAYAAACpYG+ZAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAMUtJREFUeJzt3Qt0VNW9x/E/gSRAgBBMIBKDgEAoIIgoCIoNSkHkukr1AlJREC+1GLUmsFpSKw9fUWl5qFFqq2BLBdKHtGpRUUDbCwpiUaMYA4KAQAwK4dUmgeSu/76daSaZhDwmOXPO/n7WOmtyHjPsnCTMb/azWXl5ebkAAABYIMLpAgAAADQVgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDVaOF2AcFNWViYHDhyQtm3bSrNmzZwuDgAAqAWdlvD48ePSuXNniYiovl6H4FOJhp7k5OTa3GMAABBm9u3bJ+edd1615wk+lWhNj+/GtWvXrnF/OgAAICSOHTtmKi587+PVIfhU4mve0tBD8AEAwF3O1k2Fzs0AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBquCj5vv/22XHfddWYBMp2Zcc2aNVUWKJszZ46ce+650qpVKxk5cqTk5+c7Vl6E/4K0n3/+uXzwwQfmUfcBAN7mqiUrTp48KQMGDJBp06bJ9ddfX+X8Y489Jo8//rg8//zz0q1bN7nvvvtk9OjR8sknn0jLli0dKTPCU25urqxdu1aOHDniPxYXFydjxoyRfv36OVo2AEDjaVau1SQupDU+L774oowbN87s67ehNUEzZ86UWbNmmWNFRUXSqVMnWb58udx44421XuQsNjbWPJe1urwbelauXCkpKSmSmppqfkcKCgpk48aNkpeXJ5MmTSL8AIDL1Pb921VNXTXZvXu3HDp0yDRv+egNGDJkiGzevLna5xUXF5ubVXGDd2lzltb0aOiZPHmydOnSRaKjo82j7utxPU+zFwB4k2eCj4YepZ/eK9J937lgsrKyTEDybbqkPbxrz549pnlLa3oiIgJ//XVfj+t5vQ4A4D2u6uPTGDIzMyUjI8O/rzU+hB93KCkpkcLCwjo954svvjCPZ86ckS+//LLKeT3uu05rgmorISFBoqKi6lQWAEDT80zwSUxMNI/aV0NHdfno/kUXXVTt8/TNrS5vcAgfGnqys7Pr9dxf/epXNZ5ft26d2WorLS1NkpKS6lUWAEDT8Uzw0VFcGn7efPNNf9DR2pt3331XZsyY4XTx0Ai0lkUDR11o350VK1bIOeecI9dee60cPnxYcnJyZMKECRIfHy9//etf5ZtvvpGbbrqpSlPY2coCAAh/rgo+J06ckJ07dwZ0aN6+fbt06NDBdE6955575MEHH5SePXv6h7PrSC/fyC94izYt1aeWReeC0lFd69evl759+/qbuHRfm7h0VBfNnQDgTa4azq7DjUeMGFHl+JQpU8yQdf1W5s6dK88884wcPXpUrrjiCnnqqaekV69etf43GM5uB+bxaXh/qcZCfykgfJSVlZnBHsePH5e2bdtK165d61Qb3pRq+/7tquDTFAg+dv1Bv/fee2YGcK0VvOSSS8L2D7qxaUfv+vaXCjX6SwHhIddlE73W9v3bVU1dQChpyPE1lemjraGnvv2lKtLaIl9fqYb2d6K/FBBeE71OnDgxYKJXPe7miV4JPgDq3V8qWGhhdBvgrYleI/79odA30asOENHzffr0ceUHRveVGAAANJo9Hp/oleADAAD8tCNzsJUQfHzHfde5DcEHAAD46egtpX16gvEd913nNgQfAADgp0PWdfSWdmSuvGCz7utxPa/XuRHBBwAABPTj0SHreXl5piPz3r17pbi42Dzqvh7X827s2KwY1QUAAALoUHUdsq6jt5YuXeo/rjU9bh7Krgg+AACgCg03OmTd
LTM31xbBBwAABKUhp3v37uIl7o5tAAAAdUDwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKzRwukCAIBXlJSUSGFhoYSDhIQEiYqKcroYQNgh+ABAiGjoyc7ODov7mZaWJklJSU4XAwg7BB8ACGEtiwaOhoannJwcmTBhgnm9hpQFQFUEHwAIEW1aClUtiwYXamyA0KNzMwAAsAY1PgAAIKiysjLZs2ePHD9+XNq2bStdu3aViAh315kQfAAAQBW5ubmydu1aOXLkiP9YXFycjBkzRvr16yduRfABAABVQs/KlSslJSVFJk6cKJ06dZKCggLZuHGjOT5p0iTXhh9311cBAICQN2+tXbvWhJ7JkydLly5dJDo62jzqvh7X83qdGxF8AACAn/bp0eat1NTUKv15dF+P63m9zo0IPgAAwE87Mitt3grGd9x3ndsQfAAAgJ+O3lLapycY33HfdW5D8AEAAH46ZF1Hb2lH5sr9eHRfj+t5vc6NCD4AACCgH48OWc/Ly5MVK1bI3r17pbi42Dzqvh7X826dz4fh7AAAIIAOVdch6zp6a+nSpf7jWtPj5qHsiuADAACq0HDTp08fZm4GAAB2iIiIkO7du4uXuLOBDgAAoB4IPgAAwBoEHwAAYA1PBZ958+ZJs2bNArbevXs7XSwAABAmPDeqq2/fvvLGG2/491u08Ny3CACeV1JSIoWFhRIOEhISJCoqyuliIEQ8lwo06CQmJjpdDABAA2joyc7ODot7mJaWJklJSU4XAyHiueCTn58vnTt3lpYtW8rQoUMlKytLunTpUu31Ohulbj7Hjh1ropICAGqqZdHA0dDwlJOTIxMmTDCvV18NeS7Cj6eCz5AhQ2T58uWSkpIiBw8elPnz58vw4cMlNze32sXUNBjpdYCbHT16VE6ePOnYv+9rknC6aSImJkbat2/vaBkQGtq0FKpaFg0u1NjAk8FH1w7x6d+/vwlC559/vkn8t912W9DnZGZmSkZGRkCNT3JycpOUFwhV6Fm4aJGcLi11/Ibq35qTWkRGSkZ6OuEHgB3BpzL95NerVy/ZuXNntddER0ebDXArrenR0NNz2AhpHRvndHEcc6roiORv2mDuB7U+AKwMPidOnJBdu3bJzTff7HRRgEanoadNh3juNADYMo/PrFmz5K233jILqm3atEm+973vSfPmzc1KsgAAAJ6q8dm/f78JOV9//bXpzHbFFVfIO++8Q498AADgveCzatUqp4sAAADCmKeCDwAACMQs2IEIPgAAeBizYAci+AAA4GENnQW7MEQzYPvK4jSCDwAAHhaqWbATPDIDtqeGswMAANSE4AMAAKxBU1cTo3c9AADOIfg0MXrXAwDgHIKPy3rXh7KHfTj0rgcAoCkRfFzau95LPewBAGgqdG4GAADWoMYHACo4evSonDx50rF7ok3ZFR+dEhMTI+3bt3e0DEBjIPgAQIXQs3DRIjldWur4PdF+fE5qERkpGenphB94DsEHAP5Na3o09PQcNkJax8ZZe19OFR2R/E0bzP2g1gdeQ/ABPPRmZbNQfv8aetp0iA/Z6wEIHwQfWNuXwmv9KfQTOgCgZgQfiO19KbzSn4Lmmf9vngGAmhB84Aj6UoS+PwXNMwBwdgQfOIo3awBAU2ICQwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg7W6AAAhd/ToUbPwrpMKCwsDHp0SExPToAWIEVoEHwBAyEPPwkWL5HRpaVjc2ZycHEf//RaRkZKRnk74CRMEHwBASGlNj4aensNGSOvYOKvv7qmiI5K/aYO5J9T6hAeCDwCgUWjoadMhnruLsELnZgAAYA2CDwAAsAbBBwAAWIPgAwAArEHw
AQAA1iD4AAAAazCcHQCCzL1iM9u//3Dj9CzYhR6bAZvgAwCV6IRzQDgIp1mwczwyAzbBBwAqsX3GYd9sw3Aes2CHfgZsgg8AVMKMwwg3/E6GDp2bAQCANQg+AADAGgQfAABgDU8Gn+zsbOnatau0bNlShgwZIlu2bHG6SAAAIAx4LvisXr1aMjIyZO7cufL+++/LgAEDZPTo0fLVV185XTQAAOAwzwWfhQsXyvTp0+XWW2+VPn36yNKlS6V169by3HPPBb2+uLhYjh07FrABAABv8lTwKSkpkW3btsnIkSP9xyIiIsz+5s2bgz4nKytLYmNj/VtycnITlhgAADQlTwWfw4cPy5kzZ6RTp04Bx3X/0KFDQZ+TmZkpRUVF/m3fvn1NVFoAANDUrJ/AMDo62mwAAMD7PFXjEx8fL82bN5eCgoKA47qfmJjoWLkAAEB48FSNT1RUlAwaNEjefPNNGTdunDlWVlZm9u+8806niwc0KttX1Lb9+wdgYfBROpR9ypQpcskll8jgwYNl8eLFZlEzHeUFeFFMTIxZtZhFJf9/9Wa9HwBgTfCZOHGiFBYWypw5c0yH5osuukheffXVKh2eER74lN7we6ArFWekp5uA7xT9m8vJyZEJEyZIQkKCY+XQ0NPQlZsROvx9cw/CkeeCj9JmLZq23IFaitDQN/tweMPX0JOUlOR0MRAm+PsOHdtD5KkQfv+eDD5wj57DRkjr2Dix/Q+aNwh4EX/fofv75v+I0CH4wFEaetp0iOenAHgQf9+hY3uIPBXCD4gEHwAAwhwhMnQ8NY8PAABATQg+AADAGgQfAABgDYIPAACwBp2bAaAS5kyxe84YeBvBBwD+jeU/Krw5sPwHPIrgAwD/xvIf/8HyH/Aqgg8cZXuTguIehBeW/wC8jeADR9CkEIhmBQBoGgQfWNukoFhVHADsQvCB2N6koFhVHADswDw+AADAGgQfAABgDYIPAACwBsEHAABYg87NAACEOdvn+zoVwu+f4AMAQJhizrPQz3dG8AEAIEyFw5xnhYWFkpOTIxMmTDBTf7h9GRWCDwAAYSxc5jxLSEiQpKQkcTs6NwMAAGvUOfjMnTtXvvjii8YpDQAAQDgFnz//+c9ywQUXyNVXXy0vvPCCFBcXN07JAAAAnA4+27dvl61bt0rfvn3lRz/6kSQmJsqMGTPMMQAAAM/18Rk4cKA8/vjjcuDAAXn22Wdl//79cvnll0v//v1lyZIlUlRUFPqSAgAAONm5uby8XEpLS6WkpMR8HRcXJ08++aQkJyfL6tWrG1o2AAAA54PPtm3b5M4775Rzzz1X0tPTTQ3Qjh075K233pL8/Hx56KGH5O677w5tSQEAAJo6+Fx44YVy2WWXye7du00z1759++SRRx6RHj16+K+ZNGmSmfAIAAAgnNR5AkOduXHatGk1TmIUHx8vZWVlDS0bAACAs8HH15ensn/+85+yYMECmTNnTqjKBgBwMdsX1lTcAw8En/nz58sPf/hDad26dcDxU6dOmXMEHwCwGwtrNs7imnCwxqdZs2ZVjn/wwQfSoUOHEBULAOBW4bCwphcX10QTBx9t3tLAo1uvXr0Cws+ZM2fkxIkTpiYIAIBwWVjTS4troomDz+LFi01tj3Zs1iat2NhY/7moqCjp2rWrDB06NETFAgAAcDD4TJkyxTx269ZNhg0bJpGRkY1QHAAAAIeDz7Fjx6Rdu3bma52sUEdw6RaM7zoAAABXBh/t33Pw4EHp2LGjabMN1rnZ1+lZ+/sAAAC4NvisX7/eP2JLvw4WfAAAADwRfL797W/7v05NTW3M8gAAAITPWl09e/aUefPmmcVIAQAAPB187rjjDnnllVekd+/ecumll8qSJUvk0KFDjVM6AAAAJ4NPenq6bN26VXbs2CHXXnutZGdnS3JysowaNUp+85vfhLJsAAAAzgYfH529WScy/Oyzz+Rvf/ubmRr81ltvDW3pAAAAnFyrq6ItW7bICy+8IKtX
rzZz/YwfPz50JQMAAHA6+GgNz+9+9ztZuXKl7N69W6666ip59NFH5frrr5c2bdqEunwAAADONXVpp+ZXX31V0tLSZP/+/fLaa6/JLbfcEhahR9cL8y2k6tseeeQRp4sFAADcWuOTl5dnhrSHq/vvv1+mT5/u32/btq2j5QEAAC4OPuEcenxBJzExsdbXFxcXm81H+yoBAACLm7p0uYrDhw/71+3S/eo2p2nT1jnnnGMWU12wYIGcPn26xuuzsrIkNjbWv+nQfAAAYHGNz6JFi/xNRvp1uK7Vdffdd8vFF19sAtimTZskMzPTLK66cOHCap+j12RkZATU+BB+AACwOPhMmTLF//XUqVOlKc2ePduMGquJTqaona4rBpj+/ftLVFSU3H777aZWJzo6Ouhz9Xh15wAAgOV9fJo3b25qUTp27Bhw/OuvvzbHzpw5E8ryycyZM88atrp37x70+JAhQ0xT1549eyQlJSWk5QIAABYEn/Ly8qDHtYOw1rCEWkJCgtnqY/v27RIREVElpAEAADvVOvg8/vjj5lH79/z6178OmLdHa3nefvtt09zklM2bN8u7774rI0aMMP2RdF/XFZs8ebLpkA0AAFDr4KOdmn01PkuXLjVNXj5a06OTB+pxp2g/nVWrVsm8efNM7VO3bt1M8KnY7wcAANit1sFHl6dQWqPypz/9KexqUXQ01zvvvON0MQAAgJf6+GzYsEFsd/ToUTl58qRj/35hYWHAo1NiYmKkffv2jpYBAIBGDT433HCDDB48WH7yk58EHH/sscdk69at8vvf/168HnoWLlokp0tLnS6K5OTkOPrvt4iMlIz0dMIPAMC7wUc7MWs/msrGjBkjv/jFL8TrtKZHQ0/PYSOkdWx4Nfc1pVNFRyR/0wZzP6j1AQB4NvicOHEi6LD1yMhIq9a50tDTpkO808UAAAChXqurogsvvFBWr15d5biOqOrTp09dXw4AACB8a3zuu+8+uf7662XXrl1y1VVXmWNvvvmmvPDCC/KHP/yhMcoIAK5QUlLS4EEHoRq8oBO/NsaksoB1wee6666TNWvWyMMPP2yCTqtWrWTAgAGyfv36sFidHQCcomElOzs7LAYvpKWlSVJSUkjKAlgdfNTYsWPNprRfz8qVK2XWrFmybdu2kK/VBQBuobUsGjjCQX2X+gG8rl7Bxze669lnn5U//vGP0rlzZ9P8FapPOgDgRtq0RC0L4KHgc+jQIVm+fLkJPFrTM2HCBLM8hDZ90bEZAAB4ZlSX9u1JSUmRDz/8UBYvXiwHDhyQJ554onFLBwAA4ESNz9q1a+Xuu++WGTNmSM+ePUNZBgAAgPAKPn//+99NE9egQYPkW9/6ltx8881y4403Nm7pALhiGHYo149jGDYQWvx91zP4XHbZZWbTZi6dwPC5556TjIwMKSsrk3Xr1klycrK0bdu2ti8HwIPDsEOxfhzDsIHQ4u+7gaO6dEXuadOmmS0vL8/UAj3yyCMye/Zs+c53viN/+ctf6vqSABzGMGzAu/j7DtFwdqWdnXVV9qysLHnppZdMLRAA92EYNuBd/H03cK2uYJo3by7jxo2jtgcAAHg/+AAAALgBwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUaNI8P4OZp2EO51ALLLACAOxB8ILZPwx6KpRZYZgGAF5WVlcmePXvk+PHjZlmqrl27SkSEuxuLCD5wLaZhB4DGk5ubK2vXrpUjR474j8XFxcmYMWOkX79+rr31BB+4FtOwA0DjhZ6VK1eapakmTpwonTp1koKCAtm4caM5PmnSJNeGH3fXVwEAgJA3b61du9aEnsmTJ0uXLl0kOjraPOq+Htfzep0bEXwAAICf9unR5q3U1NQq/Xl0X4/reb3OjWjqAgAAftqRWWnzVrDOzXq84nVuQ/ABAAB+GnDU5s2bZevWrVU6N1966aUB17kNwQcAAPhprU5MTIy8/vrrpj/P8OHDpUWLFnL69GnJy8sz
x/W8XudGBB8AABDUrl27TNjx0QDkdnRuBgAAftqn5+TJk+brZs2a/edEhX09T+dmAADgekVFReaxV69eZvj63r17/Z2bdUj7b3/7W8nPz/df5zbur7MCAAAhc/LftT19+/Y1TVvdu3cPOK/HNfj4rnMbmroAAICfdlxWH3/8cZVJCnX/k08+CbjObQg+AADALzY21jx+9tlnsmLFCtPUVVxcbB51X49XvM5taOoCAAB+Okxd5+tp3bq1HDp0SJYuXeo/p8eTkpLk1KlTDGcHAADuFxERYVZg18VItYOzzuMTGRkppaWlprZHN12ktPJyFm5BjQ8AAAigK69ruNHFSCvO46M1Pm5emV0RfGCtYGvQuPUTDACEWr9+/aRPnz6e+3+S4AMr5ebmmk8yldeg0epdN3+ScQIBEvCuiIiIKsPZ3Y7gAytDj7Zd6xo0EydONCsNFxQUyMaNG81xt1fjNiUCJOBtZR6sGSf4wLo/Yq3p0dCjM5L6/oB1NlLd16Gael6rd93+x93YCJCAt+V6tGac4AOr6CcX/SPWmp7KwUb3U1NTzdBNvc5r1buhRIBEYyspKZHCwsIGvYbv+Q19nYSEBImKihKb5Hq4ZpzgA6toda3SP+JgfMd91yE4AiQam4aV7OzskLxWTk5Og56flpZm5q6xRZnHa8ZdE3weeugheeWVV2T79u0meR89erTKNTqr5IwZM2TDhg3Spk0bmTJlimRlZZm1RgClbdRKP7noH3FlerzidQiOAInGprUsGjjCpSw22ePxmvEWbqr2HD9+vAwdOlSeffbZKufPnDkjY8eOlcTERNm0aZMcPHhQbrnlFjPp0sMPP+xImRG+M5JqdW3FTzK+Tzl6XM/rdageARKNTT/g2lTLEk6Oe7xm3DV1VPPnz5f09HS58MILg55//fXXzcJpWgV30UUXmc5XDzzwgKkq1dBUHV1/5NixYwEbvD8jqU7IFWwNGj2u591YfetUgAy2iCEBEvDGB5tg3F4z7pn/3Tdv3mxCUcWEOnr0aBNkdIXZ6mhTmC605tuSk5ObqMRwekZS/ePV6loN1fqo+27usNeUCJCAd3X1+Acb1zR1nY0upFa5Ws63r+eqk5mZKRkZGf59DUqEH+/z6oykTk1pX3kRQwIk4I21ulasWGH69FQc1aU146zVVU+zZ8+WRx99tMZrduzYIb1795bGEh0dbTbYx4szkjY1AiTgTf08/MHG0RqfmTNnytSpU2u8prZvTNqpecuWLUHbIfUcgMZBgAS8qZ9Ha8ZbOD1EMFTDBHW0lw55/+qrr6Rjx47m2Lp166Rdu3bmBwcAAOrGix9sXNPHR0fdfPPNN+ZRh67rfD6qR48eZs6eUaNGmYBz8803y2OPPWb69fzsZz8z80DQlAUAAFwVfObMmSPPP/+8f3/gwIHmUScr1I5XzZs3l5dfftlMYKi1PzExMWYCw/vvv79RynOq6D9rl9jI9u8fAOBOrgk+y5cvN1tNzj//fPnrX//aJOXJ37ShSf4dAABgYfAJNz2HjZDWsXFic40P4Q8A4DYEn3rS0NOmQ3xofxoAAKBRuXtMGgAAQB0QfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsEYLpwvgVqeKjojNbP/+AQDuRPCpo5iYGGkRGSn5mzaI7fQ+6P0AAMAtCD511L59e8lIT5eTJ0+KUwoLCyUnJ0cmTJggCQkJjpVDQ4/eD9itrKxM9uzZI8ePH5e2bdtK165dJSKCVnQA4YngUw/6Zh8Ob/gaepKSkpwuBiyWm5sra9eulSNH/tP0GRcXJ2PGjJF+/fo5
WjYACIbgA6BeNPSsXLlSUlJSZOLEidKpUycpKCiQjRs3muOTJk0i/AAIO9RHA6hX85bW9GjomTx5snTp0kWio6PNo+7rcT2v1wFAOCH4AKgz7dOjzVupqalV+vPovh7X83odAIQTgg+AOtOOzEqbt4LxHfddBwDhguADoM509JbSPj3B+I77rgOAcEHwAVBnOmRdR29pR+bK/Xh0X4/reb0OAMIJwQdA3f/jiIgwQ9bz8vJkxYoVsnfvXikuLjaPuq/H9Tzz+QAINwxnB1AvOk+PDlnX0VtLly71H9eaHoayAwhXBB8ADQo/ffr0YeZmAK5B8AHQINqc1b17d+4iAFcg+ABoENbqAuAmBB8A9cZaXQDcxjWjuh566CEZNmyYtG7dutoFQps1a1ZlW7VqVZOXFbBprS6drPCHP/yhzJ071zzqvh7X8wAQblwTfEpKSmT8+PEyY8aMGq9btmyZHDx40L+NGzeuycoI2IK1ugC4lWuauubPn28ely9fXuN1WhuUmJhY69fVuUd08zl27FgDSgnYtVaXrspe3VpdOsRdr6PjM4Bw4poan9pKS0uT+Ph4GTx4sDz33HNSXl5e4/VZWVkSGxvr35KTk5usrIBbsVYXALfyVPC5//77JScnR9atWyc33HCD3HHHHfLEE0/U+JzMzEwpKiryb/v27Wuy8gJuxVpdANzK0eAze/bsoB2SK26ffvpprV/vvvvuk8svv1wGDhwoP/nJT+THP/6xLFiwoMbnREdHS7t27QI2ADVjrS4AbuVoH5+ZM2fK1KlTa7ymIf0DhgwZIg888IDpw6MBB0Bo1+rS0Vu6Npf26dHRXLoquy5Qqmt16bIVrNUFINw4GnwSEhLM1li2b99u1g0i9AChx1pdANzINaO6dNXnb775xjyeOXPGhBrVo0cPadOmjbz00kvm0+Zll10mLVu2NP18Hn74YZk1a5bTRQc8i7W6ALiNa4LPnDlz5Pnnn/fvaz8etWHDBlPNHhkZKdnZ2ZKenm5GcmkgWrhwoUyfPt3BUgPex1pdANzENcFH5++paQ6fa665xmwAAABWDGcHAACoCcEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBotnC6AbUpKSqSwsLBBr+F7fkNfJyEhQaKiohr0GgAAuAnBp4lpWMnOzg7Ja+Xk5DTo+WlpaZKUlBSSsgAA4AYEnyamtSwaOMKlLAAA2ITg08S0aYlaFgAAnEHnZgAAYA2CDwAAsAbBBwAAWIPgAwAArOGK4LNnzx657bbbpFu3btKqVSu54IILZO7cuWZOnIo+/PBDGT58uLRs2VKSk5Plsccec6zMAAAg/LhiVNenn34qZWVl8stf/lJ69Oghubm5Mn36dDl58qT8/Oc/N9ccO3ZMRo0aJSNHjpSlS5fKRx99JNOmTZP27dvLD37wA6e/BQAAEAaalZeXl4sLLViwQJ5++mn5/PPPzb5+fe+998qhQ4f8sxHPnj1b1qxZY4JTdYqLi83mowFKa4uKioqkXbt2TfCdAACAhtL379jY2LO+f7uiqSsY/cY6dOjg39+8ebNceeWVAUswjB49WvLy8uTIkSPVvk5WVpa5Ub5NQw8AAPAmVwafnTt3yhNPPCG33367/5jW9HTq1CngOt++nqtOZmamCVG+bd++fY1YcgAAYG3w0aaoZs2a1bhVbqb68ssv5ZprrpHx48ebfj4NFR0dbarEKm4AAMCbHO3cPHPmTJk6
dWqN13Tv3t3/9YEDB2TEiBEybNgweeaZZwKuS0xMlIKCgoBjvn09BwAA0MLpRTJru1Cm1vRo6Bk0aJAsW7ZMIiICK6uGDh1qOjeXlpZKZGSkObZu3TpJSUmRuLi4Rik/AABwF1f08dHQk5qaKl26dDHD1wsLC02/nYp9d77//e+bjs0638/HH38sq1evliVLlkhGRoajZQcAAOHDFfP4aM2NdmjW7bzzzgs45xuNryOyXn/9dUlLSzO1QvHx8TJnzpw6z+Hjez0dFgcAANzB9759tll6XDuPT2PZv38/Q9oBAHApHZ1duZKkIoJPJTpDtHaibtu2rRlVFo58kyzqD5dRaNzLcMDvJPcyHPF7add9LC8vl+PHj0vnzp2r9AN2XVNXU9KbVVNSDCcMv+dehht+J7mX4YjfS3vuY2xsrDc6NwMAAIQCwQcAAFiD4ONCOtv03LlzzSO4l+GA30nuZTji95L7GAydmwEAgDWo8QEAANYg+AAAAGsQfAAAgDUIPgAAwBoEH5fJzs6Wrl27SsuWLWXIkCGyZcsWp4vkSm+//bZcd911ZoZPnaF7zZo1ThfJlbKysuTSSy81M5137NhRxo0bJ3l5eU4Xy5Wefvpp6d+/v3+SuKFDh8ratWudLpbrPfLII+Zv/J577nG6KK4zb948c+8qbr179xa3I/i4iK44r6vN61D2999/XwYMGCCjR4+Wr776yumiuc7JkyfN/dMgifp76623zMLA77zzjllMuLS0VEaNGmXuL+pGZ4zXN+lt27bJe++9J1dddZV897vflY8//phbWU9bt26VX/7ylyZQon769u0rBw8e9G9///vfXX8rGc7uIlrDo5+un3zySf+6Yrp+yl133SWzZ892uniupZ9iXnzxRVNbgYYpLCw0NT8aiK688kpuZwN16NBBFixYILfddhv3so5OnDghF198sTz11FPy4IMPykUXXSSLFy/mPtaxxkdrw7dv3y5eQo2PS5SUlJhPgiNHjgxYV0z3N2/e7GjZAJ+ioiL/Gzbq78yZM7Jq1SpTc6ZNXqg7rYkcO3ZswP+ZqLv8/HzTJaB79+5y0003yd69e11/G1mk1CUOHz5s/jPs1KlTwHHd//TTTx0rF+CjNZDaj+Lyyy+Xfv36cWPq4aOPPjJB51//+pe0adPG1ET26dOHe1lHGhq1O4A2daFhrQzLly+XlJQU08w1f/58GT58uOTm5pp+fW5F8AEQsk/Y+h+iF/oAOEXfYLRZQWvO/vCHP8iUKVNMsyHhp/b27dsnP/rRj0yfMx0EgvobM2aM/2vtJ6VB6Pzzz5ecnBxXN78SfFwiPj5emjdvLgUFBQHHdT8xMdGxcgHqzjvvlJdfftmMltNOuqifqKgo6dGjh/l60KBBpsZiyZIlpoMuake7BOiAD+3f46O15fq7qf0ji4uLzf+lqLv27dtLr169ZOfOneJm9PFx0X+I+h/hm2++GdC0oPv0AYBTysvLTejRJpn169dLt27d+GGEkP6N6xs1au/qq682TYZac+bbLrnkEtM/Rb8m9DSsw/iuXbvk3HPPFTejxsdFdCi7Vn3rH/HgwYPNCAXt/Hjrrbc6XTRX/gFX/NSye/du85+idsrt0qWLo2VzW/PWCy+8IH/+859Nm/+hQ4fM8djYWGnVqpXTxXOVzMxM07Sgv3/Hjx8393Xjxo3y2muvOV00V9Hfw8p9zGJiYuScc86h71kdzZo1y8x3ps1bBw4cMFOpaHCcNGmSuBnBx0UmTpxohgvPmTPHvMHo8MxXX321SodnnJ3OkzJixIiAUKk0WGpnPtR+0j2VmpoacHzZsmUydepUbmMdaPPMLbfcYjqRanDUPhUaer7zne9wH+GI/fv3m5Dz9ddfS0JCglxxxRVmzi792s2YxwcAAFiDPj4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgCqpbNY68KEtpo3b56ZId0rmjVr
JmvWrHG6GICjCD6Ah2zevNmspTN27Ng6P7dr165m/bfKy6R89tln0ph0PSp9Qz569KiE41pFFRcGbkzHjh2Te++9V3r37i0tW7aUxMREGTlypPzpT38yi8ECCA3W6gI85Nlnn5W77rrLPOqigp07d27Q6+lCo15cbLSkpESioqLOel2bNm3M1tg09Ok6SEVFRfLggw/KpZdeKi1atJC33npLfvzjH8tVV11ldc0bEErU+AAeWnF+9erVMmPGDFPjE2yx1Zdeesm8qWqNQnx8vHzve9/zLzL6xRdfSHp6uql90a1yU5fW/OjxTz/9NOA1Fy1aJBdccIF/Pzc316wyroFBF9C9+eab5fDhw/X+voqLi03NS1JSkllle8iQIaaWyEcXUNSFFPV869at5cILL5SVK1cGvIZ+f3feeafcc8895vsePXq0v6ZJa3QuueQS89xhw4ZJXl5etU1duvDquHHj5Oc//7mce+65ZsVvXaG+tLTUf40uMqr3XwNjt27dzCrrwWrTKvrpT38qe/bskXfffdcslNunTx/p1auXTJ8+XbZv327u5f333x90dXEt33333efff+6556Rv374SHR1tyqjfd3X27dsnEyZMMD/jDh06yHe/+11TDsDLCD6AR+Tk5JhmkpSUFJk8ebJ5A6zYRPLKK6+YoHPttdfKP/7xD/OGP3jwYHNOm1POO+888+aqb9y6VaZvxBoQfve73wUc1/3vf//7/poLrZ0YOHCgvPfee/Lqq69KQUGBeXOtL33j1ia8VatWyYcffijjx4+Xa665RvLz8835f/3rXzJo0CDz/Wno+sEPfmDC1pYtWwJe5/nnnze1PP/7v/8rS5cu9R/X5qVf/OIXprxayzJt2rQay7NhwwbZtWuXedTX1HBYMWTqCuta26bB6o9//KM888wzZuX16pSVlZnv7aabbgpaQ6ehx1euHTt2yNatW/3n9Oeo9+TWW281+08//bQJYnoPPvroI/nLX/4iPXr0CPrvaljTANi2bVv529/+Zu6L/lt6b7VGDPCscgCeMGzYsPLFixebr0tLS8vj4+PLN2zY4D8/dOjQ8ptuuqna559//vnlixYtCji2bNmy8tjYWP++nr/gggv8+3l5eZqsynfs2GH2H3jggfJRo0YFvMa+ffvMNXptMFpGPX/kyJEq57744ovy5s2bl3/55ZcBx6+++uryzMzMar+XsWPHls+cOdO//+1vf7t84MCBQf/dN954w3/slVdeMcf++c9/mv25c+eWDxgwwH9+ypQp5j6dPn3af2z8+PHlEydONF/rfdDnb9261X8+Pz/fHKt8b30KCgrM+YULF5afzZgxY8pnzJjh37/rrrvKU1NT/fudO3cuv/fee6t9vv47L774ovn6t7/9bXlKSkp5WVmZ/3xxcXF5q1atyl977bWzlgVwK2p8AA/Q5hmt4dAmH6U1BNoxWfv6+GiTydVXX92gf+fGG280TSHvvPOOv7bn4osvNjVN6oMPPjA1Ib6+Mbr5zmktSV1prcWZM2dMbVPF19S+L77X0/MPPPCAaeLS5ho9/9prr8nevXsDXktrhYLp37+//2ttGlI11dBoM5J2IK/4HN/1+nPQe6/3xEdrXOLi4qp9vbp0XNamL23G01ourZXRZjRfDZWWQWuaavsz1p/Vzp07TY2P777q/dPXrs/PCnALOjcDHqAB5/Tp0wFNJfqGqv08nnzySYmNjQ1JJ2UdaaRNWfqGe9lll5lH7VNUsZ/RddddJ48++miV5/pCRV3o62nI2LZtW0DYUL5OxwsWLJAlS5aYPjQafrQfkPblqdxco8eDiYyM9H/t69ukzU/VqXi97zk1XX82CQkJpo9N5b5Twei91Z/piy++aJrttLnqv//7v825uv589d5qGKzcdOkrE+BV1PgALqeB5ze/+Y3pp6K1Or5NP9FrEPJ19NWajZqGZusbqdaenI32RdFO1Nrv5vPPPze1QD5a0/Hxxx+bzrxa01Fx
qy541ET7CmmZtDaj8utpCFPaN0U75Wq/pgEDBkj37t0bfQh+dbR/lf48tO+Nj9aqHDlypNrnREREmHuoAURrbIIFFH1NpbVJ2vl52bJlZtPn+QKP1tzofa/t8Hv9WWk/qY4dO1a5txqUAa8i+AAu9/LLL5s31ttuu82M+qm43XDDDf7mrrlz55oQpI/aSVabkSrWzOib5ttvvy1ffvlljaOwrr/+ejl+/Lip6RkxYkRALZN2rP3mm29Mk5t2wtUmE2120s63ZwtVWp7KwU2buDRoaYdh7YC9e/du06SXlZVlOjOrnj17yrp162TTpk3m+7r99ttNh2onaLOezr2jnYu1nBqA9GsNJ77apGAeeughSU5ONiPWNMR+8sknJpRoB3UNfxp+fP7nf/5H1q9fbzqOV+6IraPQNAA//vjj5vnvv/++PPHEE0H/Tb2vOsJNQ6N2btZ7qx2y7777btm/f38I7woQXgg+gMtpsNE322Cf0jX46GglHfmjQ7p///vfm5E+OgRam6wqjnzSEV3af0eHptfU1KE1C9rkosFE3zwr0hCkNTAackaNGmWanrTZSZtytGajJldeeaV5k/dtvj45WrOhwWfmzJmmRkWHk2uo6tKlizn/s5/9zNRe6Agl/R61JkivcYoGFx3Gr9+PjqLTfjl6z3QKgepo3xrtN6W1VjqPj37/w4cPN0FVm/Iq/mw16Omwew1ZGpQq0togbfJ76qmnTF+k//qv//KPfqtMh+9r0NX7qGH2W9/6lgnP2senXbt2IbwjQHhppj2cnS4EAHiV1p5obc4bb7zR4M7lSv/L1vBzxx13SEZGRkjKCNiEzs0AEELaDKVNU1rbpfMh6czL2oyoNUANVVhYaOb8OXTokH/uHgB1Q/ABgBDSkVY6E7N2/NYmLm2W0o7LlUeD1Yd2RNZ+OTopYk1D5AFUj6YuAABgDTo3AwAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAABii/8DEcVhOX6xFgwAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "ax = sns.boxplot(x=\"cycle\", y=value_column, data=res_df,color=\"lightblue\")\n",
    "ax.set_xlabel(\"Active Learning Cycle\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1faec23",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Now let's plot the distributions for the top 100 molecules from the input data vs the top 100 found using active learning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "bf64efc0",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjIAAAGwCAYAAACzXI8XAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAKIZJREFUeJzt3QlwFGX6x/GHKxFCCEfCFcGAIhAuEVkVD0AOF5GF1TKAIJflgVEERN24CioK6IpEIEu80cUAWRVltdBCBNFFlEMQFAMKQrgkakggSIIw/3re//ZUEhIMMJPpt/P9VI3J9PTMvDPSnV+/ZyWfz+cTAAAAC1UOdQEAAADOFEEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaVcXjTpw4IXv37pXIyEipVKlSqIsDAADKQKe5O3TokDRu3FgqV65ccYOMhpgmTZqEuhgAAOAMZGZmyrnnnltxg4zWxDhfRK1atUJdHAAAUAa5ubmmIsL5O15hg4zTnKQhhiADAIBd/qhbCJ19AQCAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsFdIgs3LlSunXr5+ZtU+HV73zzjsnzeo3ceJEadSokVSvXl169uwp27ZtC1l5AQCAu4Q0yOTl5UmHDh0kJSWlxMeffvppmTlzpqSmpsoXX3whERERcu2118rRo0fLvawAAMB9QjohXp8+fcytJFobk5ycLA8//LD079/fbHv99delQYMGpuZm0KBBJT4vPz/f3ArPDAgAALzJtX1kduzYIfv37zfNSY6oqCi59NJL5fPPPy/1eVOnTjX7OTfWWao4dIHQ7du3y8aNG81PvQ/AGzi+Yd0SBRpilNbAFKb3ncdKkpSUJOPHjz9prQZ42+bNm2XJkiWSnZ3t31anTh1T49e2bduQlg3A2eH4hpVB5kyFh4ebGyrWSW7+/Ply4YUXypVXXinVqlWTY8eOydatW832wYMHE2YAS3F8w9og07BhQ/Pzp59+MqOWHHr/oosuCmHJ4LbqZq2J0ZFv+m8jIyPD/1jt2rXNdn08Pj5eKld2bUsqgBJwfMPqINOsWTMTZpYtW+YPLtpMpKOXRo8eHeriwSV+/PFH05ykt1atWplO4Nr8qKFmxYoV8t133/n3a968eaiLC+A0cHyjLEJ6iXr48GHZsGGDuTkdfPX3Xbt2mXllxo4dK0888YQsXrxYNm3aJMOGDTNX2AMGDAhlseEiOTk55qc2Kw0dOlSaNm1qmhb1p97X7YX3A2APjm+4vkZm7dq10r17d/99p5Pu8OHDZe7cufLAAw+YuWZuv/12OXjwoOn/8MEHH8g555wTwlLDTfTfh2rTps1JTUd6X5uUtK+Msx8Ae3B8w/VBplu3bma+mNJorczjjz9ubkBJdJJE9c0330inTp2KhBltX9fthfcDYA+Ob5QFvR9hNZ0rSGmty7x580yzpE6IqD/1vrOkhbMfAHtwfMPqzr5AWcTFxZn5YmrUqGHmF9LlLAqPWoqNjZUjR46Y/QDYheMbZUGQgdW0KUknvdP5Ylq2bClXXXWVfx4ZrY3R4dg6jwxDrwH7cHyjLCr5TtVJxQN0yLZWT2rv91q1aoW6OAgSZv4EvIvju2LKLePfb4IMPEM79+q8E4cOHZLIyEhTLU1NDOANHN8VT24ZgwxNS/AMDS1Megd4E8c3SsOoJQAAYC2CDAAAsBZNS/AM2tABoOIhyMATGNUAABUTQQaeCDHOPDIDBw4ssvq1btd5ZNq2bRvqYgIAgoA+MrC+OWnJkiUmxJS0+rVu18d1PwCA91AjA6vpvDHZ2dmmJqak1a91YVJdtkD3Y2g2KqqCggLJysoKdTFQTExMjISFhfG9nCWCDKymk98pbU4qibPd2Q+oiDTEpKSkhLoYKCYxMdGsB4ezQ5CB1XQGX6V9YrQ5qTjdXng/oCLSK3/9o+mFQJaeni4JCQnmM9nOC5/B
DQgy8MTquNqxV/vEFG5e0n4xul0fZ/VrVGTafOGlK38NAF76PDg7dPaFJ1bH1VWu582bJ7t27ZL8/HzzU+/rdn2cNZcAwJuokYH1dGi1DrHW0UnasdehNTEMvQYAbyPIwDNhJj4+ntWvAaCCIcjAM1gdFwAqHvrIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFqOW4Bk6k68uDqnrKumSBDqbLxPhAYC3EWTgCZs3bzYT4ulK2IUnxNNZfXWOGQCANxFk4IkQM3/+fGnZsqUMHDjQrHiti0XqOku6ndl9AcC76CMD65uTtCZGQ4wuGqkrYIeHh5ufel+36+O6HwDAewgysJr2idHmpG7dup3UH0bv63Z9XPcDAHgPQQZW0469SpuTSuJsd/YDAHgLQQZW09FJSvvElMTZ7uwHAPAWggyspkOsdXSSduwt3g9G7+t2fVz3AwB4D0EGVtN+MDrEOiMjQ+bNmye7du2S/Px881Pv63Z9nPlkAMCbGH4N6+k8MTrEWkcnpaam+rdrTQxDrwHA2wgy8EyYiY+PZ2ZfAKhgCDLwDG0+at68eaiLAQAoR/SRAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGtVDXUBgEA5ceKE/Pjjj3Lo0CGJjIyUuLg4qVyZrA4AXkaQgSds3rxZlixZItnZ2f5tderUkT59+kjbtm1DWjYAQPAQZOCJEDN//nxp2bKlDBw4UBo0aCA//fSTrFixwmwfPHgwYQYAPIp6d1hNm5O0JkZDzNChQ6Vp06YSHh5ufup93a6P634AAO8hyMBq2idGm5O6det2Un8Yva/b9XHdDwDgPQQZWE079iptTiqJs93ZDwDgLQQZWE1HJyntE1MSZ7uzHwDAWwgysJoOsdbRSdqxt3g/GL2v2/Vx3Q8A4D0EGVhN+8HoEOuMjAyZN2+e7Nq1S/Lz881Pva/b9XHmkwEAb2L4Nayn88ToEGsdnZSamurfrjUxDL0GAG8jyMAzYSY+Pp6ZfQGggiHIwCgoKJCsrCzrvw2dQ0Zvat++fWK7mJgYCQsLC3UxAMC1CDIwNMSkpKTwbbhMYmKixMbGhroYAOBaBBn4r/z1j6YXAll6erokJCSYz2Q7L3wGAKiwQeb48ePy6KOPmtEn+/fvl8aNG8uIESPk4YcflkqVKoW6eJ6izRdeuvLXAOClzwMAsDDIPPXUUzJnzhx57bXXpE2bNrJ27VoZOXKkREVFyZgxY0JdPAAAEGKuDjKrVq2S/v37S9++fc19ndRMVzP+8ssvQ100AADgAq6eEK9Lly6ybNky2bp1q7m/ceNG+eyzz8wEZ6XRydByc3OL3AAAgDe5ukbmb3/7mwkirVq1kipVqpg+M08++aQMGTKk1OdMnTpVHnvssXItJwAACA1X18jo6JM33nhD0tLSZP369aavzDPPPGN+liYpKUlycnL8t8zMzHItMwAAKD+urpG5//77Ta3MoEGDzP127drJzp07Ta3L8OHD/3BCNAAA4G2urpE5cuTISYv9aRNT8VWOAQBAxeTqGpl+/fqZPjFNmzY1w6+/+uorefbZZ2XUqFGhLhoAAHABVweZWbNmySOPPCJ33XWXHDhwwEyId8cdd8jEiRNDXTQAAOACrg4ykZGRkpycbG4AAABW9ZEBAAA4FYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGsRZAAAgLUIMgAAwFoEGQAAYC2CDAAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQA
AIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGsRZAAAgLUIMgAAwFoEGQAAYC2CDAAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAgIoTZCZNmiQ7d+4MTmkAAACCGWTeffddOf/886VHjx6SlpYm+fn5p/sSAAAAoQkyGzZskDVr1kibNm3k3nvvlYYNG8ro0aPNNgAAANf3kenYsaPMnDlT9u7dKy+//LLs3r1brrjiCmnfvr0899xzkpOTE/iSAgAABLKzr8/nk2PHjklBQYH5vU6dOjJ79mxp0qSJLFy48GxeGgAAIDhBZt26dXL33XdLo0aNZNy4caaGZsuWLfLJJ5/Itm3b5Mknn5QxY8acyUsDAAAEL8i0a9dOLrvsMtmxY4dpVsrMzJRp06bJBRdc4N9n8ODBkpWVdbovDQAAcFqqnt7uIgkJCTJq1CiJjY0tdZ/o6Gg5ceLE6b40AABAcGtknL4wxf3222/y+OOPS6Dt2bNHhg4dKvXq1ZPq1aubGqG1a9cG/H0AAEAFCDKPPfaYHD58+KTtR44cMY8FUnZ2thkNVa1aNVmyZIl8++23Mn369BKDFAAAqHiqnkmNTKVKlU7avnHjRqlbt64E0lNPPWVGQL366qv+bc2aNTvlc3SCvsKT9OXm5ga0TAAAwMIaGa0F0aCiIebCCy80vzu3qKgo6dWrl+k/E0iLFy+WSy65RG666SapX7++GR314osvnvI5U6dONeVxbhqEAABABa+RSU5ONrUx2tFXm5A0JDjCwsIkLi5OLr/88oAWbvv27TJnzhwZP368PPTQQ2b2YB3Wre83fPjwEp+TlJRk9i9cI0OYAQCgggcZJzho006XLl1Mv5Vg05FPWiMzZcoUc19rZDZv3iypqamlBpnw8HBzAwAA3lempqXC/Uw0TOgIJd1W0i2QdMK9+Pj4Ittat24tu3btCuj7AAAAD9fIaP+Yffv2mX4qtWvXLrGzr9MJ+Pjx4wErnI5YysjIKLJt69atct555wXsPQAAgMeDzMcff+wfkaS/lxRkgkGXP9BmLG1a0o7EX375pbzwwgvmBgAAUKYg07VrV//v3bp1K7dvrXPnzrJo0SLTgVcn29P+OdrpeMiQIeVWBgAA4KF5ZFq0aGGChN7092C7/vrrzc3tDh48KHl5eaEuRoXnrPHFWl/uEBERYZqjAcA1Qeauu+6StLQ0mTx5slx88cVm+YCBAwdKw4YNpaLSEPPsjBny+7FjoS4K/ic9PZ3vwgWqVqsm48eNI8wAcE+Q0X4retNOt2+88YakpKTIhAkTpHv37ibUDBs2TCoarYnRENOiS3epEcXyCYA6kpMt21YtN8cHtTIAXBNkHDq7r06Mp7fVq1fL6NGjZeTIkRUyyDg0xNSsGx3qYgAAUGGccZBROopIm5kWLlxo5pDRpQQAAABcG2ScJqX58+fLjh075JprrjGLO95www1Ss2bN4JQSAAAgEEGmVatWZlh0YmKiDBo0SBo0aHC6LwEAABCaIKMz7ZbHsGsAAICArLVUGCEGAABYVSOjyxNo35jo6Giz7tKplij49ddfA1k+AACAswsyM2bMkMjISP/v5bXWEgAAwFkHmeHDh/t/HzFiRFmeAgAA4L4+MlWqVJEDBw6ctP2XX34xjwEAALg2yPh8vhK35+fnS1hYWCDKBAAAENjh1zNnzjQ/tX/MSy+9VGTyu+PHj8vKlSvNHDMAAACuCzLaydepkUlNTS3SjKQ1MXFxcWY7AACA64KMLkegdJXrt99+2wzDBgAAsGpm3+XLlwenJAAAAMHu7HvjjTeaRSKLe/rpp1n9GgAAuDvIaKfe66677qTtffr0MY8BAAC4
NsgcPny4xGHW1apVk9zc3ECVCwAAIPB9ZNq1aycLFy6UiRMnFtm+YMECiY+PP92XAwDXO3jwoOTl5YW6GBVeVlZWkZ8IrYiICKldu7Z9QeaRRx6RG264QX744Qe55pprzLZly5ZJWlqavPnmm8EoIwCENMQ8O2OG/H7sGP8XXCI9PT3URYCIVK1WTcaPGxfyMHPaQaZfv37yzjvvyJQpU0xwqV69unTo0EE+/vhjs0o2AHiJ1sRoiGnRpbvUiGLaCUAdycmWbauWm+PDuiCj+vbta25K+8XMnz9fJkyYIOvWrTOz/AKA12iIqVk3OtTFAHC2nX0dOkJJV8Vu3LixTJ8+3TQzrV69+kxfDgAAILg1Mvv375e5c+fKyy+/bGpiEhISzGKR2tRER18AAODaGhntG9OyZUv5+uuvJTk5Wfbu3SuzZs0KbukAAAACUSOzZMkSGTNmjIwePVpatGhR1qcBAACEvkbms88+k0OHDkmnTp3k0ksvldmzZ8vPP/8cvJIBAAAEKshcdtll8uKLL8q+ffvkjjvuMBPgaUffEydOyNKlS03IAQAAcPWoJZ3Jb9SoUaaGZtOmTXLffffJtGnTpH79+vKXv/wlOKUEAAAI5PBrpZ1/ddXr3bt3m7lkAAAArAkyjipVqsiAAQNk8eLFgXg5AACA4M3si9KnbAbA8QCg/BBkAkjXnQAAAOWHIBNALCoHnLyoHAAEE0EmgFhUDgAACzv7AgAAhAJBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGsRZAAAgLUIMgAAwFoEGQAAYC2CDAAAsBZBBgAAWIsgAwAArEWQAQAA1qoa6gJ4yZGc7FAXAXANjgcA5YEgEwARERFStVo12bZqeSBeDvAMPS70+ACAYCHIBEDt2rVl/LhxkpeXF4iXw1nIysqS9PR0SUhIkJiYGL7LENMQo8cHAAQLQSZA9GTNCds9NMTExsaGuhgAgCCzqrPvtGnTpFKlSjJ27NhQFwUAALiANUFmzZo18vzzz0v79u1DXRQAAOASVgSZw4cPy5AhQ+TFF1+UOnXqhLo4AADAJawIMomJidK3b1/p2bPnH+6bn58vubm5RW4AAMCbXN/Zd8GCBbJ+/XrTtFQWU6dOlcceeyzo5QJQsTAvDuDO48HVQSYzM1PuvfdeWbp0qZxzzjllek5SUpKMHz/ef19rZJo0aRLEUgKoCJgnCnAnVweZdevWyYEDB+Tiiy/2bzt+/LisXLlSZs+ebZqRqlSpUuQ54eHh5gYAgdSiS3epEUUfPcCpkXFLuHd1kOnRo4ds2rSpyLaRI0dKq1at5MEHHzwpxABAsGiIqVk3mi8YcBlXB5nIyEhp27btSTOF1qtX76TtAACg4rFi1BIAAIB1NTIlWbFiRaiLAAAAXIIaGQAAYC2CDAAAsBZBBgAAWMu6PjIIjoKCAsnKyrL+63U+gxc+i4qJiZGwsLBQFwMAXIsgA/8f/pSUFM98G+np6eIFus5YbGxsqIsBAK5FkIH/yl//aMJ9/18AAKUjyMDQ5guu/AEAtqGzLwAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGsRZAAAgLUIMgAAwFoEGQAAYC2CDAAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYiyADAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMA
AKxFkAEAANYiyAAAAGsRZAAAgLUIMgAAwFoEGQAAYC2CDAAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtaqGugAAYIMjOdmhLgLgGkdcdDwQZADgFCIiIqRqtWqybdVyviegED0u9PgINYIMAJxC7dq1Zfy4cZKXl8f3FGJZWVmSnp4uCQkJEhMTE+riVHgRERHm+Ag1ggwA/AE9WbvhhI3/pyEmNjaWrwMGnX0BAIC1CDIAAMBaBBkAAGAtggwAALCWq4PM1KlTpXPnzhIZGSn169eXAQMGSEZGRqiLBQAAXMLVQeaTTz6RxMREWb16tSxdulSOHTsmvXv3ZhgkAABw//DrDz74oMj9uXPnmpqZdevWydVXXx2ycgEAAHdwdZApLicnx/ysW7duqfvk5+ebmyM3N7dcygYAAMqfq5uWCjtx4oSMHTtWrrjiCmnbtu0p+9VERUX5b02aNCnXcgIAgPJjTZDRvjKbN2+WBQsWnHK/pKQkU3Pj3DIzM8utjAAAoHxZ0bR09913y3vvvScrV66Uc88995T7hoeHmxsAAPA+VwcZn88n99xzjyxatEhWrFghzZo1C3WRAACAi1R1e3NSWlqavPvuu2Yumf3795vt2velevXqoS4eAAAIMVf3kZkzZ47p59KtWzdp1KiR/7Zw4cJQFw0AALiA65uWAAAArKyRAQAAOBWCDAAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAaxFkAACAtQgyAADAWgQZAABgLYIMAACwFkEGAABYy9VrLQEAzl5BQYFkZWVZ/1U6n8ELn0XFxMRIWFhYqIthPYIMAHic/uFPSUkRr0hPTxcvSExMlNjY2FAXw3oEGQDwOL3y1z+acN//F5w9ggwAeJw2X3DlD6+isy8AALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGsRZAAAgLUIMgAAwFoEGQAAYC2CDAAAsBZBBgAAWIsgAwAArEWQAQAA1iLIAAAAa3l+9Wufz2d+5ubmhrooAACgjJy/287f8QobZA4dOmR+NmnSJNRFAQAAZ/B3PCoqqtTHK/n+KOpY7sSJE7J3716JjIyUSpUqhbo4KIcEr6E1MzNTatWqxfcNeAjHd8Xi8/lMiGncuLFUrly54tbI6Ic/99xzQ10MlDMNMQQZwJs4viuOqFPUxDjo7AsAAKxFkAEAANYiyMBTwsPDZdKkSeYnAG/h+EaF7OwLAAC8ixoZAABgLYIMAACwFkEGAABYiyADV/nvf/8r7dq1k2rVqsmAAQNCXRwAJZg7d67Url27wn43jz76qFx00UWhLgb+hyCDgBkxYoSZPVlvGkSaNWsmDzzwgBw9erTMrzF+/HhzgtixY4c5WQI4e59//rlUqVJF+vbte9rPjYuLk+Tk5CLbBg4cKFu3bg3q/5oVK1aYc8nBgwfFbSZMmCDLli0LdTHwPwQZBNSf//xn2bdvn2zfvl1mzJghzz//vBkOXVY//PCDXHPNNWY25jO94isoKDij5wFe9fLLL8s999wjK1euNEu2nK3q1atL/fr1xWvKeu6oWbOm1KtXL+jlQdkQZBDweR4aNmxo1jvSpqGePXvK0qVL/eteTZ061dTU6ImwQ4cO8uabb5rHfvzxR3P19csvv8ioUaPM706NzObNm6VPnz7m5NGgQQO55ZZb5Oeff/a/Z7du3eTuu++WsWPHSnR0tFx77bVlft6YMWNMrVHdunVNubXKuDC9GrzjjjvM88855xxp27atvPfee/7HP/vsM7nqqqvM59HPrK+Xl5fHvyq4xuHDh2XhwoUyevRoUyNTUk3nf/7zH+ncubP5N67H0F//+lf/MbJz504ZN26cv7a1eNOS1szo9u+++67Ia+qFzPnnn++//0fH4+nKz883NSOxsbESEREhl156qanFcei5ZPDgwebxGjVqmCbr+fPnF3mNks4dTk2Q1rhccskl5rldunSRjIyMUpuWtDZaz3fPPPOMNGrU
yIScxMREOXbsmH8fvcDT71/PFXoOTEtLK7G2C6ePIIOg0RPXqlWrJCwszNzXEPP6669LamqqfPPNN+bkOHToUPnkk09MCNADXddQ0QNbf9fqaw0SWkPTsWNHWbt2rXzwwQfy008/SUJCQpH3eu2118z7aB8bff3TeZ6eBL/44gt5+umn5fHHHy8SvPTEq685b948+fbbb2XatGmmit6pPdIaqBtvvFG+/vpr88dCg42eGAG3SE9Pl1atWknLli3N8fbKK6+Yxfgc77//vgku1113nXz11VfmD/if/vQn89jbb79takf1uNBjUm/FXXjhheYP/htvvFFku96/+eabze9lPR5Phx5n2mS2YMECc/zddNNN5njctm2beVybtDt16mQ+n56Lbr/9dhOevvzyy1OeOxx///vfZfr06aa8VatWNRdYp7J8+XJzTtCf+poa9gqHxmHDhpnaMA1Kb731lrzwwgty4MCBM/78KEQnxAMCYfjw4b4qVar4IiIifOHh4Xqm9FWuXNn35ptv+o4ePeqrUaOGb9WqVUWec+utt/oGDx7svx8VFeV79dVX/fcnT57s6927d5HnZGZmmtfOyMgw97t27err2LFjkX3K+rwrr7yyyD6dO3f2Pfjgg+b3Dz/80JTf2b84Lfvtt99eZNunn35qnvPbb7+V4RsDgq9Lly6+5ORk8/uxY8d80dHRvuXLl/sfv/zyy31Dhgwp9fnnnXeeb8aMGUW26TGqx6pDHz///PP99/WY0WNty5YtZT4ei9My6uPZ2dknPbZz505zrtmzZ0+R7T169PAlJSWV+ln69u3ru++++/z3Szp3OO/70Ucf+be9//77ZptzXE+aNMnXoUOHIuc+/Z5+//13/7abbrrJN3DgQPO7fg/6/DVr1vgf37Ztm9lW/LvF6fP86tcoX927d5c5c+aY5hWtWtYrGa2x0BqYI0eOSK9evU5qk9artNJs3LjRXOFodXRxevWjV4NKr7zO5Hnt27cv8phWCztXSRs2bDBXo86+JZVNrwQLX4nqla7W5Ghn5datW5f6uYDyoM0hWgOxaNEic1+PR63p1D4z2qzi/Du/7bbbzup9Bg0aZJp5Vq9eLZdddpk5Ji6++GJTE3Q6x2NZbdq0SY4fP37S87S5yem7oo9PmTLF1Ejt2bPHnGv0cW0qKqz4ucNR+Nyg5wWl54amTZuWuH+bNm38tbXOc7Sczv8H/e71O3FccMEFUqdOndP63CgZQQYBpc00eoAqrcLWfjB60tS+JUqrebXNurBTrYuk7fv9+vWTp5566qTHnJOL875n8jwdXVWYto1rEFHaln0q+h7af0b7xRRX2skOKE967P3+++/SuHHjImFbj7nZs2dLVFTUH/47LwvtX6ZNR9rvQ4OM/tQ+Oad7PJaVvp6GhnXr1hUJD8oJS//4xz/kueeeM03V2j9GzxHaF6Z4h97i546Szg1O3yDn3PBH+zvPOdX+CByCDIKmcuXK8tBDD5kh1dohUE+eu3btkq5du5b5NfQKRtuTtVOcXtEE+3nFr8h2795tyl7SFaO+h/abcYIb4CYaYLRPmvbz6N27d5HHtGOqdny98847zb9z7RczcuTIEl9H+49o7cYfGTJkiOk4rx1sddSi1tIE8ngsTGtxtUxaQ6Kd7UuifV769+9v+gUpDRV6LMfHx0t50/5J+v9D+yA5NUDff/+9ZGdnl3tZvIjOvggq7YCnV0w6DFurnrWDr3aE0+rk9evXy6xZs8z90mjP/19//dWcHNesWWOe9+GHH5qT7qlOrmf6vMI0cF199dWmaUw7AGtz0ZIlS0xHRfXggw+azsza6VCr57WT4bvvvktnX7iCjq7TP5S33nqrqREtfNN/01pbo3R6BA01+nPLli2mOaRwzYmGDx22rc0zpxpldMMNN8ihQ4dMTYw2MReuBTqb41HLo8eXc9NmKr2w0OCkHWi1Q7Iem9qEpgMKtNZXtWjRwhy3eozq59LaU+1gHAra
xKYjOLXDsZZTA43+rrVhTm0PzhxBBkGlV1/6h15HBCUlJckjjzxiTjbaf0RHGOhJR4cilkZPhnplpSc7varUKmKtHtahn1rjE+jnFadXkTosVU/AeiWnV5zOiVevZHXElV7l6VWhXiVOnDixyAkcCBUNKvrHU5uPitMgo6NxtI+X9pX597//LYsXLzZDirWJqPDIHh2xpNMj6FDqmJiYUt8vMjLSNB9p0NCQEajjUS8m9Nhybk6NxquvvmqCzH333WdqPLSWSUOS06z78MMPm5ogHVKtn1Gbv0I5W7jWjumwc/08OkpM+yXpd6ZD3nF2KmmP37N8DQAAcBq02Vqnnfjoo4+kR48efHdngSADAECQffzxx6aTstZG6Xw8WrurzXVao1u8ozBOD519AQAIMp3lVwc/aEdobVLS2YJ1mDoh5uxRIwMAAKxFZ18AAGAtggwAALAWQQYAAFiLIAMAAKxFkAEAANYiyAAAAGsRZAAExYgRI8w6MnrTuTJ0evZevXqZVdFPZ1XguXPnmqnsQ1H+UE5pD6BsCDIAgkbX09JZTHWtHl1wUxcTvPfee+X66683qwEDwNkiyAAImvDwcLNYX2xsrFnAT2c21RXCNdRoTYt69tlnzbTtERERZu2Zu+66y0zlrlasWGFWSM7JyfHX7jz66KPmsX/9619yySWXmFlS9T1uvvlmOXDggP+9deVnXbxQFzrUVYZ1NWRdaNCRmZkpCQkJpranbt260r9/fxO4lL6HrsquZXXeV8sCwH0IMgDKla6u3KFDB3n77bf//yRUubLMnDlTvvnmGxMedE0aXYdG6TTuycnJUqtWLVOzo7cJEyb4p3yfPHmyWW35nXfeMSFEm4McutL6t99+a0LTli1bZM6cORIdHe1/rq6KrCHo008/NSsz16xZ09QgFRQUmPfQkOPUKOlNywLAfVhrCUC5a9WqlXz99dfm97Fjx/q3x8XFyRNPPCF33nmn/POf/5SwsDCJiooyNSJa61LYqFGj/L83b97chKHOnTub2hwNJbt27ZKOHTuaWhvntR0LFy40/XReeukl89pKa2u0dkZrXnr37m1qcfLz8096XwDuQo0MgHLn8/n8AeKjjz6SHj16mOYnrSG55ZZb5JdffpEjR46c8jXWrVsn/fr1k6ZNm5rnde3a1WzXAKNGjx4tCxYskIsuusjU8Kxatcr/XK3F+f77783zNPToTZuXjh49Kj/88ENQPzuAwCLIACh32tTTrFkz0xykHX/bt28vb731lgknKSkpZh9t4ilNXl6eaRrSJiddQXjNmjWyaNGiIs/r06eP7Ny5U8aNGyd79+41YclpltJam06dOsmGDRuK3LZu3Wr62gCwB01LAMqV9oHZtGmTCRgaXLSJZ/r06aavjEpPTy+yvzYvHT9+vMi27777ztTaTJs2zXQQVmvXrj3pvbSj7/Dhw83tqquukvvvv1+eeeYZ0/FYm5fq169vwlBJSnpfAO5DjQyAoNE+Jvv375c9e/bI+vXrZcqUKWZ0kNbCDBs2TC644ALT8XbWrFmyfft2MxIpNTW1yGto3xatQVm2bJn8/PPPpslJm5M0aDjPW7x4sen4W9jEiRPNqCNtQtKOxO+99560bt3aPKajmbTjr5ZFO/vu2LHD9I0ZM2aM7N692/++2o8nIyPDvK+WE4AL+QAgCIYPH+7TU4zeqlat6ouJifH17NnT98orr/iOHz/u3+/ZZ5/1NWrUyFe9enXftdde63v99dfNc7Kzs/373Hnnnb569eqZ7ZMmTTLb0tLSfHFxcb7w8HDf5Zdf7lu8eLF5/KuvvjKPT5482de6dWvzunXr1vX179/ft337dv9r7tu3zzds2DBfdHS0eY3mzZv7brvtNl9OTo55/MCBA75evXr5atasaV53+fLl/DsBXKiS/ifUYQoAAOBM0LQEAACsRZABAADWIsgAAABrEWQAAIC1CDIAAMBaBBkAAGAtggwAALAWQQYAAFiLIAMA
AKxFkAEAANYiyAAAALHV/wHQV9Y/qQ59QgAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Get a dataframe with the top 100 reference molecules\n",
    "ref_top_100 = ref_df.sort_values(\"Activity\", ascending=False).head(100).copy()\n",
    "ref_top_100['Dataset'] = 'Reference'\n",
    "# Get a dataframe with top 100 molecules predicted by active learning\n",
    "pred_top_100 = res_df.sort_values(\"Activity\", ascending=False).head(100).copy()\n",
    "pred_top_100['Dataset'] = 'Active Learning'\n",
    "# Make a boxplot comparing the scores of the top 100 molecules\n",
    "ax = sns.boxplot(x=\"Dataset\", y=\"Activity\", data=pd.concat([ref_top_100, pred_top_100]), color=\"lightblue\")\n",
    "ax.set_ylabel(\"Activity\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f607c2b",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "source": [
    "We can also compare the scores of the molecules selected by active learning with the scores of an equally sized random sample drawn from the input data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "59ec5da3",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAj4AAAGwCAYAAACpYG+ZAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAL6NJREFUeJzt3Ql4VOX59/E7EBIhQCAkEIiBENkEZCmyKoUoZalVqRYEREAoisZSWarwr7JWIyqLYipaZbGVrVUp1WIFRKUFBKlURUBQlrAHhIRFE5Z5r/t5O9OZbBJIcmbO8/1c13FyzpyZeTI4J7951jCPx+MRAAAAC5RzugAAAABlheADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGCNcKcLEGwuXrwoBw8elCpVqkhYWJjTxQEAAJdApyU8deqU1KlTR8qVK7xeh+CTh4aexMTES3mPAQBAkMnIyJCrr7660PsJPnloTY/3jatatWrp/usAAIASkZ2dbSouvH/HC0PwycPbvKWhh+ADAEBo+aFuKnRuBgAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWCKng89FHH8mtt95qFiDTmRmXLVuWb4GyCRMmSO3ataVixYrSrVs32blzp2PlBQAAwSWkgs+ZM2ekZcuWkp6eXuD9Tz/9tDz//PMyZ84c+fjjjyUqKkp69Ogh33//fZmXFQAABJ+QWqurV69eZiuI1vbMmjVLHnvsMbn99tvNsddee01q1aplaob69etXxqUFAADBJqRqfIqye/duOXz4sGne8oqOjpb27dvL+vXrC31cTk6OWdHVfwMAAO4UUjU+RdHQo7SGx5/ue+8rSFpamkyePLnUywcAoSI3N1cyMzOdLgb8xMXFSUREBO9JCXBN8Llc48ePl9GjR/v2tcYnMTHR0TIBgJM09BTWlxLOSE1NlYSEBN7+EuCa4BMfH29ujxw5YkZ1eel+q1atCn1cZGSk2QAA/6td0D+0bghwS5culb59+5rfKZSFevmDiWuCT/369U34Wb16tS/oaO2Nju564IEHnC4eAIQMbVJxU+2ChgY3/T6wKPicPn1adu3aFdChecuWLRITEyN169aVhx9+WH73u99Jw4YNTRB6/PHHzZw/vXv3drTcAAAgOIRU8Pnkk08kJSXFt+/tmzN48GCZP3++PPLII2aun/vuu09OnjwpN954o7z77rty1VVXOVhqAAAQLEIq+HTt2tXM11MYnc15ypQpZgMAAHDtPD4AAAA/hOADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAa4U4XAKEpNzdXMjMznS4G/MTFxUlERATvCQAUgeCDy6KhJz09nXcviKSmpkpCQoLTxQCAoEbwwWXXLugfWjcEuKVLl0rfvn3N7xTKQr38AFAWCD64LNqk4qbaBQ0Nbvp9AAAFo3MzAACwhquCz6RJkyQsLCxga9KkidPFAgAAQcJ1TV3NmjWTVatW+fbDw133KwIAgMvkulSgQSc+Pv6Sz8/JyTGbV3Z2dimVDAAAOM1VTV1q586dUqdOHUlOTpa7775b9u3bV+T5aWlpEh0d7dsSExPLrKwAAKBsuSr4tG/fXubPny/vvvuuvPjii7J7927p3LmznDp1qtDHjB8/XrKysnxbRkZGmZYZAACUHVc1dfXq1cv3c4sWLUwQqlevnpmnZdiwYQU+JjIy0mwAAMD9XFXjk1e1atWkUaNGsmvXLqeLAgAAgoCrg8/p06fl
66+/ltq1aztdFAAAEARcFXzGjh0rH374oezZs0fWrVsnP//5z6V8+fLSv39/p4sGAACCgKv6+Ozfv9+EnOPHj5slCG688UbZsGEDaxgBAAD3BZ/Fixc7XQQAABDEXNXUBQAAUBSCDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAa4U4XAADc5OTJk3LmzBmniwERyczMDLiFs6KioqRatWqO/zMQfACgBEPPjJkz5fy5c7ynQWTp0qVOFwEiEl6hgoweNcrx8EPwAYASojU9GnoadkqRStHVeV+B/zqbdUJ2rltjPiMEHwBwGQ09lWNinS4GgALQuRkAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGq4MPunp6ZKUlCRXXXWVtG/fXjZu3Oh0kQAAQBBwXfBZsmSJjB49WiZOnCj//ve/pWXLltKjRw85evSo00UDAAAOc13wmTFjhgwfPlzuvfdeadq0qcyZM0cqVaokc+fOdbpoAADAYa4KPrm5ubJ582bp1q2b71i5cuXM/vr16wt8TE5OjmRnZwdsAADAnVwVfI4dOyYXLlyQWrVqBRzX/cOHDxf4mLS0NImOjvZtiYmJZVRaAABQ1lwVfC7H+PHjJSsry7dlZGQ4XSQAAFBKwsVFYmNjpXz58nLkyJGA47ofHx9f4GMiIyPNBgAA3M9VNT4RERHSpk0bWb16te/YxYsXzX7Hjh0dLRsAAHCeq2p8lA5lHzx4sFx//fXSrl07mTVrlpw5c8aM8gIAAHZzXfC56667JDMzUyZMmGA6NLdq1UrefffdfB2eAaC0nM06wZsLBOlnwnXBRz300ENmAwAn7Fy3hjceCFKuDD4A4KSGnVKkUnR1/hEAvxqfYPlCQPABgBKmoadyTCzvKxCEXDWqCwAAoCgEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAa4Q7XQAAcJuzWSecLgIQVM4G0WeC4AMAJSQqKkrCK1SQnevW8J4CeehnQz8jTnNV8ElKSpK9e/cGHEtLS5Nx48Y5ViYA9qhWrZqMHjVKzpw543RRICKZmZmydOlS6du3r8TFxfGeOCwqKsp8RpzmquCjpkyZIsOHD/ftV6lSxdHyALCLXtiD4eKO/9HQk5CQwFsCdwYfDTrx8fFOFwMAAAQh143qeuqpp6RGjRrSunVreeaZZ+T8+fNFnp+TkyPZ2dkBGwAAcCdX1fiMHDlSfvSjH0lMTIysW7dOxo8fL4cOHZIZM2YU+hjtAzR58uQyLScAAHBG0Nf4aMfksLCwIrft27ebc0ePHi1du3aVFi1ayIgRI2T69Okye/ZsU6tTGA1HWVlZvi0jI6MMfzsAAFCWgr7GZ8yYMTJkyJAiz0lOTi7wePv27U1T1549e6Rx48YFnhMZGWk2AADgfuGh0Bv/cochbtmyRcqVKyc1a9Ys8XIBAIDQE/TB51KtX79ePv74Y0lJSTEju3R/1KhRMnDgQKlevbrTxQMAAEHANcFHm6sWL14skyZNMn166tevb4KP9vsBAABwVfDR0VwbNmxwuhgAACCIBf2oLgAAgJLimhqfUHLy5EnW8gmitXz8b+GsYFnLB4B7FTv4TJw4UYYOHSr16tUrnRK5nIaeGTNnyvlz55wuCvzoQoYIjtWbdZFP
wg+AoAk+f/3rX+WJJ56QLl26yLBhw+TOO+9kHpxi0FWbNfQ07JQilaIZbQZ4nc06ITvXrTGfEYIPgKAJPjo3zqeffirz5s2TX//615Kamir9+vUztUBt27YtnVK6kIaeyjGxThcDAACrXFbnZl0A9Pnnn5eDBw/Kq6++Kvv375cbbrjBLBXx3HPPmaUfAAAAXDWqy+PxyLlz5yQ3N9f8rBMFvvDCC5KYmChLliwpuVICAAA4FXw2b94sDz30kNSuXdtMEqg1QNu2bZMPP/xQdu7cafoA6UrpAAAAIR18rrvuOunQoYPs3r3bNHPpauZPPfWUNGjQwHdO//79GR4MAABCv3Nz3759TUfmhISEQs+JjY2VixcvXmnZAAAAnK3x8fblyeu7776TKVOmlFS5AAAAnA8+kydPltOnT+c7fvbsWXMfAACAq2p8wsLC8h3/z3/+IzExMSVVLgAAAOf6+GjzlgYe3Ro1ahQQfi5cuGBqgUaMGFHyJQQAACjr4DNr1ixT26Mdm7VJKzo62ndfRESEJCUlSceOHUuqXAAAAM4Fn8GDB5vb+vXrS6dOnaRChQolXxoAAACng092drZUrVrV/KyTFeoILt0K4j0PAAAgJIOP9u85dOiQ1KxZ06yaXFDnZm+nZ+3vAwAAELLB5/333/eN2NKfCwo+AAAArgg+Xbp08f3ctWvX0iwPAABA8Mzj07BhQ5k0aZJZjBQAAMDVwefBBx+Ud955R5o0aSJt27aV5557Tg4fPlw6pQMAAHAy+IwaNUo2bdok27Ztk5/+9KeSnp4uiYmJ0r17d3nttddKsmwAAADOBh8vnb1ZJzL86quvZO3atZKZmSn33ntvyZYOAADAiQkMC7Jx40ZZuHChLFmyxMz106dPn5IrGQAAgNPBR2t4Xn/9dVm0aJHs3r1bbrrpJpk2bZrccccdUrly5ZIuHwAAgHPBx9upOTU1Vfr16ye1atUqudIAAAAEU/DZsWOHGdIOAABgxTw+AAAArq3x0eUqtG9PbGysWberqCUrvv3225IsHwAAQNkGn5kzZ0qVKlV8P7NWFwAAcG3wGTx4sO/nIUOGlGZ5AAAAgqePT/ny5eXo0aP5jh8/ftzcBwAA4Jrg4/F4Cjyek5MjERERJVEmAAAAZ4ezP//88+ZW+/e88sorAZMVXrhwQT766CMzxw8AAEDIBx/t1Oyt8ZkzZ05As5bW9CQlJZnjpeWJJ54wq8Jv2bLFvN7JkyfznbNv3z554IEHZM2aNSaYad+ktLQ0CQ+/opU5AACAS1xyItDlKVRKSoq8+eabZlh7WcrNzTVrgXXs2FFeffXVfPdrrdMtt9wi8fHxsm7dOjl06JAMGjRIKlSoIE8++WSZlhUAAASnYleFaG2KE3QleDV//vwC73/vvffkyy+/lFWrVpllNFq1aiVTp06VRx99VCZNmlRo/yPtm6Sbly62CgAA3KnYwefOO++Udu3amUDh7+mnn5ZNmzbJn//8Z3HC+vXr5brrrgtYO6xHjx6m6Wvr1q3SunXrAh+nTWHeUFWWzmadKPPXBIIZnwkAQRl8tBOz1qDk1atXL5k+fbo45fDhw/kWTPXu632FGT9+vIwePTqgxicxMVFK2851ztScAQBgs2IHn9OnTxfYbKR9aYrbTDRu3DiZNm1akeds27atVEeLRUZGmq2sNeyUIpWiy7afFBDsNT58IQAQdMFHm5OWLFkiEyZMCDi+ePFiadq0abGea8yYMT84E3RycvIlPZd2at64cWPAsSNHjvjuCzYaeirHxDpdDAAArFLs4PP444/LHXfcIV9//bXcdNNN5tjq1atl4cKF8pe//KVYzxUXF2e2kqCjvXTIu84qXbNmTXNs5cqVUrVq1WIHMgAA4E7FDj633nqrLFu2zAwR16BTsWJFadmypbz//vtmFffSonP06MrveqtD13U+H9WgQQMzZ0/37t1NwLnnnntMR2vt1/PYY49JamqqI01ZAAAg+FzWzH46
X45uSvv1LFq0SMaOHSubN282oaQ0aNPaggULfPveUVo6vL5r165mQsW3337bjOLS2p+oqCgzgeGUKVNKpTwAACD0XPaUxjq6SycSfOONN6ROnTqm+Ss9PV1Ki87fU9gcPl716tWTv//976VWBgAAYFHw0eYjDR8aeLSmp2/fvmbyP236oh8NAABwzers2rencePG8tlnn8msWbPk4MGDMnv27NItHQAAgBM1PitWrJCRI0eaPjQNGzYsyTIAAAAEV43PP//5Tzl16pS0adNG2rdvLy+88IIcO3asdEsHAADgRPDp0KGD/OEPfzCrnt9///1mwkLt1Hzx4kUzX46GIgAAAFcEHy8dJj506FBTA/T555+b2ZefeuopM2ngbbfdVjqlBAAAcCL4+NPOzjpZ4P79+81cPgAAAK4NPl46eWDv3r1l+fLlJfF0AAAAwRt8AAAAQgHBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKwR7nQBbHU264TTRQCCCp8JAGWB4FPGoqKiJLxCBdm5bk1ZvzQQ9PSzoZ8RACgtBJ8yVq1aNRk9apScOXOmrF8aBcjMzJSlS5dK3759JS4ujvfIYRp69DMCXKmLFy/KgQMHzM96W7t2bSlXjt4dIPg4Qi/sXNyDi4aehIQEp4sBoAR88cUXsmLFCjlx4v93KVi2bJl8+OGH0qtXL2nevDnvseWIvwAAV4WeRYsWSa1ateTOO+80x/RW9/W43g+7EXwAAK5p3tKansaNG8uAAQPkwoUL5rje6r4e1/v1PNiLPj4AgAC5ubmm/1uo0b482rylAefpp5+W06dP+5q6Vq1aJc2aNZPt27fLJ598EnJN29ocHxER4XQxXCFkgs8TTzwh77zzjmzZssX84588eTLfOWFhYfmOadVmv379yqiUABD6NPSkp6dLqNqwYUO+YxqCPv74Y18QCjWpqakhF9aCVXgofQPp06ePdOzYUV599dVCz5s3b5707NnTt08nYgAofu2C/qENNRkZGbJ8+XLzc7169cxWvnx509S1d+9es6nbbrtNEhMTJZQw6tTC4DN58mRzO3/+/CLP06ATHx9fRqUCAPfRWvVQrF3wThPibRXwBh0VHR1tjuuX6JiYmJD8/VAyXNe5Wb+lxMbGSrt27WTu3Lni8XiKPD8nJ0eys7MDNgBA6PEGHQ03WsvTu3dvGTdunLnVfT3ufx7sFDI1PpdiypQpctNNN0mlSpXkvffekwcffNC0644cObLQx6SlpflqkwAAocs7Wku//J4/fz6gL0/16tWlRo0acvz4cUZ1Wc7R4KNJfNq0aUWes23bNmnSpMklPd/jjz/u+7l169am2vOZZ54pMviMHz9eRo8e7dvXGp9Qa/sFAIj50qvCw8PNdX/fvn1y6tQpqVKlitStW1d+//vfB5wHOzkafMaMGSNDhgwp8pzk5OTLfv727dvL1KlTTXNWZGRkgefo8cLuAwCEjsqVK5vbw4cPy+uvvy4pKSnmi/ORI0fMvh73Pw92Cne6l3pp9lTXoe9avUmwAQD30w7MXt98843s2LHDt1+hQoUCz4N9QqaPj1ZZfvvtt+ZWO6lpqFENGjQw6f1vf/ubSfUdOnSQq666SlauXClPPvmkjB071umiAwDKQFJSkvmyq01Z2tXBf743/Tuhx8+ePWvOg71CJvhMmDBBFixYENCHR61Zs0a6du1q0rxOuDVq1CgzkksD0YwZM2T48OEOlhoAUFZ09XVdiFQnrm3UqJF07tzZ/G04d+6cfPXVV2br378/q7RbLszzQ+O9LaOdm7UaNCsrS6pWrep0cVAGU9xrYGZWVMC9q7MrrQlidXZ3u9S/3yFT4wMAwKVo3ry56dSsS1doFwmdsFC7QehoL4D/CwAArq/x
Wb9+PTU+MAg+AABXhR5vH58bb7wxoI+PHtc+PlojBHsRfAAArpm5WWt66tSpY0b5+g9n13Uc9bje37RpUzo4W4zgAwBwhT179pjmLd20j0+/fv2kVq1aJgR98MEHsn37dt95VzI5LkKb6xYpBQDYSUfzKG3mGjhwoFmmQiew1Vvd1+P+58FOBB8AgCvopIWqWbNm+ZqydF+buPzPg50IPgAAV4iKijK3W7duzbcCu+7rcf/zYCeCDwDAFbxrcOkIrj/96U9miSNdpFpvdX/nzp0B58FOdG4GALhurS5diX3OnDkBo7oSEhJYqwsEHwCA+9bqaty4ccBaXVrbo8PbWasL1PgAAFxDJyfUcPP3v//dN3xdaU0QkxdC0ccHAOB6rMcNL2p8AACuW7JCm7ryTmDIkhVQ1PgAAFy1ZIWGngEDBsj58+dNc5fe6r4e1/vzDnWHXajxAQC4asmKtm3bysyZMwNWZ9c+Ptdff70JQixZYTeCDwDAFU6dOmVu33vvPbNWl47qCg8PNzU+OrfPypUrA86DnQg+AABX8M7IHBcXZ+bx8R/VpfP4xMbGyrFjx5i52XIEHwCAq2RmZubr3LxmzRozjw9A52YAgCv4N2GFhYUF3Oe/T1OX3ajxAQC4gnfV9Xbt2pmZmv2XrNDOzXp848aNrM5uOYIPAMBVfXxOnjwpo0aNMouTau1OlSpVpG7dumahUv/zYCeaugAArludfeHChWZEl47u0lvd1+P+58FO1PgAAFy3OvuhQ4dYnR0FosYHAOCq1dkPHDggZ8+ezdf/R4/r/Xoe7MW/PgDA9YuSskgpvAg+AABXrdWVkJCQrwOz7utx1uoCfXwAAK5aq0s37dTcv3//gNXZvTM5s1aX3ajxAQC4QlZWlrlt1KiRDBw40Axhj4yMNLe637Bhw4DzYCeCDwDAVRMYNmvWLF8HZt3X4/7nwU4EHwCAK3j79WzdutX09/Gn+19++WXAebATwQcA4LoJDHWWZp25OScnx9zqPhMYQtG5GQDgugkMDx8+nG+tLh3VpfP76HmwF8EHAOCqCQwXLVpkOjh37txZKlSoIOfOnTO1PbrpSC8mMLQbwQcA4BrNmzc34Ubn69mxY0dAjY8e1/tht5AIPjrnwtSpU+X999831Zd16tQxQxN/+9vfSkREhO+8zz77TFJTU2XTpk0SFxcnv/rVr+SRRx5xtOwAgLKl4aZp06bmb4d3dXZt3qKmByETfHTSKe2R/9JLL0mDBg3kiy++kOHDh5shic8++6w5Jzs7W7p37y7dunUz7bqff/65DB06VKpVqyb33Xef078CAKAMachJTk7mPUdoBp+ePXuazUv/Z9YqzBdffNEXfF5//XXJzc2VuXPnmlogna9hy5YtMmPGjCKDj/b4181LAxQAAHCnkB3OrjNvxsTE+PbXr18vP/7xjwOavnr06GECkk5fXpi0tDQzBNK7JSYmlnrZAQCAM0Iy+OzatUtmz54t999/v++Y9v3RNVn8eff1vsKMHz/ehCjvlpGRUYolBwAA1gafcePGSVhYWJGbd1E5rwMHDphmrz59+ph+PldK13GpWrVqwAYAANzJ0T4+Y8aMkSFDhhR5jn/ntIMHD0pKSop06tRJXn755YDz4uPjzQq8/rz7eh8AAICjwUeHnOt2KbSmR0NPmzZtZN68efmGJXbs2NEMb9eJqnTCKrVy5Upp3Lixmb8BAAAgJPr4aOjp2rWr1K1b14ziyszMNP12/PvuDBgwwHRsHjZsmFmgbsmSJfLcc8/J6NGjHS07AAAIHiExnF1rbrRDs25XX311wH0ej8fc6ois9957z0xgqLVCsbGxMmHCBObwAQAAoRV8tB/QD/UFUi1atJC1a9eWSZkAAEDoCYmmLgAAgJJA8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1iD4AAAAaxB8AACANQg+AADAGgQfAABgDYIP
AACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAFiD4AMAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CDwAAsAbBBwAAWIPgAwAArEHwAQAA1giJ4LNnzx4ZNmyY1K9fXypWrCjXXHONTJw4UXJzcwPOCQsLy7dt2LDB0bIDAIDgES4hYPv27XLx4kV56aWXpEGDBvLFF1/I8OHD5cyZM/Lss88GnLtq1Spp1qyZb79GjRoOlBgAAASjkAg+PXv2NJtXcnKy7NixQ1588cV8wUeDTnx8vAOlBAAAwS4kgk9BsrKyJCYmJt/x2267Tb7//ntp1KiRPPLII2a/KDk5OWbzys7OLpXyuo02M2ZmZkqo8/4Obvhd4uLiJCIiwuliAEBQC8ngs2vXLpk9e3ZAbU/lypVl+vTpcsMNN0i5cuXkjTfekN69e8uyZcuKDD9paWkyefLkMiq5e2hQSE9PF7dYunSphLrU1FRJSEhwuhgAENTCPB6Px6kXHzdunEybNq3Ic7Zt2yZNmjTx7R84cEC6dOkiXbt2lVdeeaXIxw4aNEh2794ta9euLVaNT2JioqlRqlq1arF+H5u4pcbHTajxAWCz7OxsiY6O/sG/347W+IwZM0aGDBlS5Dnan8fr4MGDkpKSIp06dZKXX375B5+/ffv2snLlyiLPiYyMNBuKR5tUqF0AAISacKe/oep2KbSmR0NPmzZtZN68eaY564ds2bJFateuXQIlhRvpSEGdBuHUqVNSpUoVSUpKuqT/rwAAoSsk+vho6NGmrXr16pl+Pf5NLN4RXAsWLDC1EK1btzb7b775psydO/cHm8NgJ50SYcWKFXLixAnfserVq0uvXr2kefPmjpYNAGB58NHmKu3QrNvVV18dcJ9/F6WpU6fK3r17JTw83PQLWrJkifziF79woMQI9tCzaNEiady4sdx1111Sq1YtOXLkiHzwwQfmeP/+/Qk/AOBSjnZuDuXOUQjd5i0d/adhZ+DAgQFNW3rfn/70JxOCtP8ZzV4A4L6/33RogFW0T482b2nTad5go/t6XO/X8wAA7kPwgVW0I7PSGp+CeI97zwMAuAvBB1bR0VtKm7MK4j3uPQ8A4C4EH1hFh6zr6C3tyKx9evzpvh7X+/U8AID7EHxgFe3Ho0PWdZFb7ci8b98+M3O33uq+Htf76dgMAO7EqK48GNVlB+bxAQB3CYklKwCn6CSFOtfThg0b5Ntvv5WYmBjp0KGDmQMKAOBeXOVhpYJqfNavX8/MzQDgcgQfWIeZmwHAXnRuhlV05JbW9OhyFTpzc926dSUyMtLc6r4e1/vzjvgCALgDwQdWYeZmALAbwQdWYeZmALAbwQdWYeZmALAbwQdWYeZmALAbwQdWYeZmALAbMzfnwczNdmDmZgBwF2ZuBn5g5uamTZuaUV7a4Vn7/mgzGGt0AYC7MYEhrKUhJzk52eliAADKEH18AACANQg+AADAGgQfAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1mLk5D4/H41vzAwAAhAbv323v3/HCEHzy0HWbVGJiYmn92wAAgFL8Ox4dHV3o/azOnsfFixfl4MGDZtHKsLCw0vp3QRB9Q9CQm5GRIVWrVnW6OABKEJ9vu3g8HhN66tSpU+SC09T45KFv1tVXX13a/z4IMhp6CD6AO/H5tkd0ETU9XnRuBgAA1iD4AAAAaxB8YLXIyEiZOHGiuQXgLny+URA6NwMAAGtQ4wMAAKxB8AEAANYg+AAAAGsQfAA/OmnlsmXLeE+AEjR//nypVq2ate/ppEmTpFWrVk4XA/9F8EHQGTJkiAkgulWoUEHq168vjzzyiHz//fdOFw2wwvr166V8+fJyyy23FPuxSUlJMmvWrIBjd911l3z11VdSmj744ANzzTh58qQEm7Fjx8rq1audLgb+i+CDoNSzZ085
dOiQfPPNNzJz5kx56aWXzLBzAKXv1VdflV/96lfy0UcfmSV8rlTFihWlZs2a4ja5ubmXdF7lypWlRo0apV4eXBqCD4J2/o34+Hizjlbv3r2lW7dusnLlSnPf8ePHpX///pKQkCCVKlWS6667ThYtWhTw+K5du8rIkSNNTVFMTIx5Lq1u9rdz50758Y9/LFdddZU0bdrU9/z+Pv/8c7npppvMhVsvXPfdd5+cPn06oHZKy/fkk09KrVq1THX+lClT5Pz58/Kb3/zGvLYugTJv3rxSe6+AkqT/fy9ZskQeeOABU+OjzVR5/e1vf5O2bduaz05sbKz8/Oc/933u9u7dK6NGjfLV2uZt6tKaHz2+ffv2gOfULzjXXHONb/+LL76QXr16mdCgn6177rlHjh07dtm/V05Ojql50etGVFSUtG/f3tQSeV3qdeWhhx6Shx9+2PzePXr08NU0aY3O9ddfbx7bqVMn2bFjR6FNXd7rxrPPPiu1a9c215bU1FQ5d+6c7xz94qfvv157tNZ74cKFBdamofgIPgh6egFct26dREREmH1t8mrTpo2888475j4NI3pR3LhxY8DjFixYYC5wH3/8sTz99NMmkHjDjS5Ge8cdd5jn1PvnzJkjjz76aMDjz5w5Yy5s1atXl02bNsmf//xnWbVqlbnw+Xv//ffNt2L9djxjxgxTM/Wzn/3MPE6fe8SIEXL//ffL/v37S/29Aq7U0qVLpUmTJtK4cWMZOHCgzJ071yz+6KWfOw06P/3pT+XTTz81f/DbtWtn7nvzzTdN0NfPmv7h1i2vRo0amYDw+uuvBxzX/QEDBpiftblKv3C0bt1aPvnkE3n33XflyJEj0rdv38v+vfRzq014ixcvls8++0z69Oljapb1C1Bxryt63fjXv/5lrhtev/3tb2X69OmmvOHh4TJ06NAiy7NmzRr5+uuvza0+p4ZD/5A5aNAgc13RYPXGG2/Iyy+/LEePHr3s3x9+PECQGTx4sKd8+fKeqKgoT2RkpF5xPeXKlfP85S9/KfQxt9xyi2fMmDG+/S5dunhuvPHGgHPatm3refTRR83P//jHPzzh4eGeAwcO+O5fsWKFea233nrL7L/88sue6tWre06fPu0755133jFlOXz4sK+s9erV81y4cMF3TuPGjT2dO3f27Z8/f978LosWLbrCdwYofZ06dfLMmjXL/Hzu3DlPbGysZ82aNb77O3bs6Ln77rsLfbx+HmbOnBlwbN68eZ7o6Gjfvt5/zTXX+PZ37NhhPnvbtm0z+1OnTvV079494DkyMjLMOXpuQbSMev+JEyfy3bd3715zTfH/vKubb77ZM378+GJdV1q3bl3g665atSrgOqHHvvvuO7M/ceJET8uWLX33e68bem3w6tOnj+euu+4yP+v7oI/ftGmT7/6dO3eaY3nfWxQfNT4ISikpKbJlyxZTYzJ48GC599575c477zT3XbhwQaZOnWqqorUpSavC//GPf8i+ffsCnqNFixYB+1ql7P3GtG3bNtOMVqdOHd/9HTt2DDhfz2nZsqWpNfK64YYbTG2RfzV2s2bNpFy5/32UtFpey+alnUS1Kptvawh2+v+11nBok4/SmgvtmKx9frz0c3nzzTdf0ev069dP9uzZIxs2bPDV9vzoRz8yNU3qP//5j6kJ0c+2d/Pep7UkxaVN1nrd0Nom/+f88MMPfc93qdcVrRUqiP/1Rq81qqjPvF439Nrg/xjv+frvoO+9videDRo0MLXIuHLhJfAcQInTsKEfdKVV7RpA9OI7bNgweeaZZ+S5554zbd16kdJztc09b0dDHRHmT9vhNbSUtIJep6xeGyhJ+hnT/mn+Xwi0mUv73L3wwgsSHR1t+pxcKe1zp01Z2m+lQ4cO5lb7FPn3M7r11ltl2rRp+R7rDRXFoc+nIWPz5s0BYUNpwFGXel3x/yLkz/8z7+3bVNRnnmuEc6jxQdDT2pT/+7//k8cee0y+++4707Z+++23m/4H
GoiSk5OLPVT22muvlYyMjIA+CN5vn/7n6DdP7evjpa+t5dH+D4CbaOB57bXXTD8VrdXxbvoZ0CDk7eirNRtFDc3W/i9ae/JD7r77btOJWvvd6OhNrQXy0pqOrVu3ms68+gXIfysseBRF+wppmbRGJe/zaQhTJXFdKSl6fdF/D+1D5bVr1y45ceKEI+VxG4IPQoJ2RNRvaunp6dKwYUPTSVk7PGtzlHYc1o6PxaGjxLTaW5vR9MK+du1a0zkx74VZR63oOdrZUavedYivdnjU5izATd5++23zh1VrVZs3bx6waTOzt7lLO+9rCNJb/fxpM5J/zYyGFe3of+DAgSJHYengglOnTpmaHm3a9q9l0hFO3377rWly04EF2hylzU7a5P1DoUrLkze46WddP8/aYVg7YO/evds06aWlpZnOzKokrislRZv19BqlHay1nBqA9GetbfPWJuHyEXwQErS9W0dl6OisMWPGmG+EOuJKh5fqNzYdGlocWmvz1ltvmRokHZHyy1/+Up544omAc3RYql5s9QKsQ3d/8YtfmL4NWuUPuI0GG/1jq81ZeWnw0dFKOhpKP3M6wnH58uVmiLY2WfmPfNIRXdp/R4emx8XFFfp6VapUMc1ZGkw0lPjTEKQ1MBpyunfvbpqetNlJh8T796criE5RoTU83s3bJ0enlNDgo9cPrVHRa4aGqrp165r7tUb5Sq8rJUlr3/QLlv4+Oopu+PDh5j3TL2O4MmHaw/kKnwMAAJQinQ5DB2TolBpX2rncdgQfAACCjM4Ppp2ytbZL+yLqZKzafKj9jvJ2jEbxMKoLAIAgo7M466AO7fitTVw6G7QO+yf0XDlqfAAAgDXo3AwAAKxB8AEAANYg+AAAAGsQfAAAgDUIPgAAwBoEHwAAYA2CD4CgMGTIELMOkXd1e52u/yc/+YnMnTu3WCvbz58/3yxt4ET5nVziAMClIfgACBo9e/Y0s9TqWk8rVqwwi1f++te/lp/97GdmtWoAuFIEHwBBIzIy0iwOmZCQYBaM1Jlr//rXv5oQpDU5asaMGWYa/6ioKLN20YMPPmim9lcffPCBWcE7KyvLV3s0adIkc98f//hHuf76680suPoaAwYMkKNHj/peW1cm18UydWFNXQVbV+vWhS29MjIypG/fvqY2KSYmRm6//XYT0JS+xoIFC0xZva+rZQEQfAg+AIKarv7dsmVLefPNN82+rs79/PPPy9atW03Y0DWNdB0jpdP6z5o1S6pWrWpqjnQbO3asbwmAqVOnmtXAly1bZkKLNk95Pf744/Lll1+akLVt2zZ58cUXJTY21vdYXbVbQ9PatWvNyuGVK1c2NVS5ubnmNTQUeWusdNOyAAg+rNUFIOg1adJEPvvsM/Pzww8/7DuelJQkv/vd72TEiBHy+9//XiIiIiQ6OtrUuGitjr+hQ4f6fk5OTjbhqW3btqa2SEPMvn37pHXr1qZWyPvcXkuWLDH9jF555RXz3Eprg7T2R2t2unfvbmqJcnJy8r0ugOBCjQ+AoOfxeHyBY9WqVXLzzTeb5jCtgbnnnnvk+PHjcvbs2SKfY/PmzXLrrbdK3bp1zeO6dOlijmvgUQ888IAsXrxYWrVqZWqQ1q1b53us1hLt2rXLPE5Dkm7a3PX999/L119/Xaq/O4CSRfABEPS06al+/fqmeUo7Ordo0ULeeOMNE2bS09PNOdrkVJgzZ86YpiptAtMVrjdt2iRvvfVWwON69eole/fulVGjRsnBgwdNuPI2k2mtUJs2bWTLli0B21dffWX6CgEIHTR1AQhq2ofn888/N4FEg442OU2fPt309VFLly4NOF+buy5cuBBwbPv27aZW6KmnnjIdotUnn3yS77W0Y/PgwYPN1rlzZ/nNb34jzz77rOlorc1dNWvWNOGpIAW9LoDgQ40PgKChfWQOHz4sBw4ckH//+9/y5JNPmtFTWsszaNAgadCggeloPHv2bPnmm2/MSK05c+YEPIf2zdEamtWr
V8uxY8dME5g2b2kw8T5u+fLlpqOzvwkTJphRWdqkpR2n3377bbn22mvNfTraSzs6a1m0c/Pu3btN356RI0fK/v37fa+r/ZB27NhhXlfLCSAIeQAgCAwePNijlyTdwsPDPXFxcZ5u3bp55s6d67lw4YLvvBkzZnhq167tqVixoqdHjx6e1157zTzmxIkTvnNGjBjhqVGjhjk+ceJEc2zhwoWepKQkT2RkpKdjx46e5cuXm/s//fRTc//UqVM91157rXnemJgYz+233+755ptvfM956NAhz6BBgzyxsbHmOZKTkz3Dhw/3ZGVlmfuPHj3q+clPfuKpXLmyed41a9aU4bsH4FKF6X+cDl8AAABlgaYuAABgDYIPAACwBsEHAABYg+ADAACsQfABAADWIPgAAABrEHwAAIA1CD4AAMAaBB8AAGANgg8AALAGwQcAAIgt/h8QFLuNJhB1HgAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Take a random sample from the input data that is the same size as the dataframe with the selected molecules\n",
    "random_df = df.sample(len(res_df)).copy()\n",
    "random_df['Dataset'] = 'Random'\n",
    "# Label the active learning data\n",
    "res_df['Dataset'] = 'Active Learning'\n",
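    "# Make a boxplot comparing the scores of the random sample and the active learning selections\n",
    "ax = sns.boxplot(x=\"Dataset\", y=\"Activity\", data=pd.concat([random_df, res_df]), color=\"lightblue\")\n",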
    "ax.set_ylabel(\"Activity\");"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
